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functions  are  carried  out  mostly  in  the  database  command  and  control  processor 
(DBCCP)  and  the  security  filter  processor  (SFP) , the  report  will  be  devoted 
largely  to  these  two  components  of  DBC.w 

DBCCP  is  responsible  for  regulating  the  entire  operation  of  DBC,  and  for 
interfacing  with  the  front-end  computer  systems.  In  the  process  of  enforcing 
security  and  clustering  records,  DBCCP  requires  the  use  of  a number  of  tables  and 
lists.  In  order  to  speed  up  the  table  search,  a subcomponent  of  SFP,  namely  the 
security  and  clustering  unit  (SCU) , is  presented  with  some  details.  SCU 
consists  of  a large  number  of  parallel  memory-processor  pairs  organized  in  a 
manner  similar  to  that  in  the  structure  memory  (SM). 

The  entire  access  control  mechanism  revolves  around  the  important  notion  of 
security  atoms.  Briefly,  this  involves  the  protection  of  aggregates  of  data 
units,  where  the  aggregates  are  definable  by  the  file  creator  in  terms  of  the 
data  contents. 

DBC  can  also  protect  data  at  the  field  (or  attribute)  level.  The  user  of 
a file  stored  in  DBC  submits  a list  of  fields  that  he  wishes  to  access  along  with 
an  access  request.  DBC  can  then  check,  from  the  tables  that  it  maintains, 
whether  the  requested  access  on  this  combination  of  fields  is  allowed.  If  not, 
the  request  is  rejected  by  DBC.  Otherwise,  DBC  performs  the  required  access. 

If  the  access  is  a 'retrieve'  request,  then  the  relevant  records  are  retrieved 
from  the  mass  memory  (MM),  which  is  the  repository  of  the  database,  and  sent  to 
the  post  processor  (PP) , a subcomponent  of  SFP.  PP  then  extracts,  from  these 
records,  those  fields  that  were  requested  by  the  user  and  sends  them  onto  the 
front-end  computer  from  which  the  user's  request  originated.  The  remainder  of 
the  fields  in  the  retrieved  records  are  discarded.  Thus,  access  imprecision 
never  results  in  the  sense  that  no  retrieved  record  has  to  be  entirely  discarded. 
Only  parts  of  retrieved  records  may  have  to  be  masked  out. 

Since  MM  consists  of  moving-head  disks  and  since  the  principal  MM  operation 
is  searching,  we  would  like  to  limit  the  search  space  in  MM.  This  is  done  by 
using  a clustering  mechanism.  The  clustering  mechanism  of  DBC  also  uses 
information  provided  by  the  file  creator.  The  file  creator  must  have  some 
ideas  of  the  types  of  queries  that  will  be  used  to  retrieve  or  delete  records 
in  his  file.  Based  on  this  knowledge,  he  specifies  a set  of  clustering  attributes] 
Records  are  clustered  together  based  on  the  values  taken  by  these  clustering 
attributes  in  the  keywords  of  the  records.  With  this  mechanism,  it  is  estimated 
that  most  user  requests  will  require  access  to  only  one  cylinder  in  MM. 

Finally,  the  report  goes  into  the  details  of  the  tables  and  lists  needed 
to  enforce  security  ana  to  facilitate  clustering,  and  the  algorithms  needed  to 
support  these  mechanisms. 
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I.  BACKGROUND 


The  database  computer  (DBC)  is  a specialized  back-end  computer  which  is 
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capable  of  managing  data  of  10  bytes  in  size  and  supporting  known  data  models 
such  as  relational,  network,  hierarchical  and  attribute-based  models.  Any 
operation  of  DBC  is  concerned  with  one  of  the  following  four  aspects  - search- 
ing and  retrieval,  security,  clustering,  or  updating.  A number  of  papers  [ 1—6 3 
are  available  that  motivate  the  design  of  DBC  and  discuss  the  search  and  re- 
trieval aspects  of  DBC  in  some  detail.  The  reader  may  wish  to  refer  to  these 
papers  for  clarification  of  details.  A report  discussing  the  update  aspects 
of  DBC  is  scheduled  to  be  published  shortly.  What  we  intend  to  demonstrate  in 
this  report  is  how  the  security  and  clustering  functions  are  carried  out  in  DBC. 
Fundamental  to  our  discussion  is  an  understanding  of  the  built-in  data  model 
and  overall  architecture  of  DBC  which  are  dealt  with  in  the  following  sub-sections. 

1.1  DBC  Data  Model 

The  smallest  unit  of  data  in  DBC  is  a keyword  which  is  an  attribute-value 
pair,  where  the  attribute  may  represent  the  type,  quality,  or  characteristic 
of  the  value.  Information  is  stored  in  and  retrieved  from  DBC  in  terms  of 
records;  a record  is  made  up  of  a collection  of  keywords  and  a record  body. 

The  record  body  consists  of  a (possibly  empty)  string  of  characters  which  are 
not  used  for  search  purposes.  For  logical  reasons,  all  the  attributes  in  a 
record  are  required  to  be  distinct.  An  example  of  a record  is  shown  below: 

(<Relation,  EMP>,  <Job,  MGR> , <Dept,  TOY>,  <Salary,  30000>). 

The  record  consists  of  four  keywords.  The  value  of  the  attribute  Job,  for 
instance,  is  MGR. 

DBC  recognizes  several  types  of  keywords:  simple,  security  and  clustering. 
Simple  keywords  are  intended  for  search  and  retrieval  purposes.  Security  key- 
words are  used  for  access  control.  Clustering  keywords  are  utilized  for  placing 
records  with  a high  probability  of  being  retrieved  together  in  close  proximity. 
Security  and  clustering  keywords  will  be  elaborated  on  in  later  sections. 

A keyword  predicate,  or  simply,  predicate,  is  of  the  form  (attribute, 
relational  operator,  value).  A relational  operator  can  be  one  of  {=,  =/ , >, 

>,  <,  <}.  A keyword  K is  said  to  satisfy  a predicate  T if  the  attribute  of  K 
is  identical  to  the  attribute  in  T and  the  relation  specified  by  the  relational 
operator  of  T holds  between  the  value  of  K and  the  value  in  T.  For  example. 
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the  keyword  <Salary,  15000>  satisfies  the  predicate  (Salary>10000) . 

A query  conjunction  is  a conjunction  of  predicates.  An  example  of  a 
query  conjunction  is: 

(Salary>25000) A(Job^MGR) A(Relation=EMP) . 

A query  is  a Boolean  expression  of  predicates  in  the  disjunctive  normal 
form.  Thus,  a query  is  a disjunction  of  query  conjunctions.  An  example  of 
the  types  of  queries  that  may  be  recognized  by  DBC  is  as  follows : 

( (Dept=TOY) A(SalaryclOOOO) ) v( (Dept=BOOK) A (Salary>50000) ) . 

[ 

If  the  above  query  refers  to  a file  of  employees  of  a department  store, 
then  it  will  be  satisfied  by  records  of  employees  working  either  in  the  toy 
department  and  making  less  than  10,000,  or  working  in  the  book  department 
and  making  more  than  50,000.  Queries,  as  defined  above,  are  used  not  only 
to  search  and  retrieve  records  from  the  database,  but  also  to  specify  protec- 
tion requirements.  It  is  important  to  note  that  by  utilizing  queries  as  a 
means  for  security  specification,  any  retrievable  data  in  DBC  can  also  be 
protected  for  security. 

1.2  DBC  Architecture 

Figure  1 shows  the  schematic  architecture  of  DBC.  It  consists  of 
two  loops  of  memories  and  processors,  namely  the  structure  loop  and  the  data 
loop.  The  data  loop  is  composed  of  two  components:  the  mass  memory  (MM),  and 
the  security  filter  processor  (SFP).  MM  is  the  repository  of  the  database 
and  is  made  of  modified  moving-head  disks  where  all  the  tracks  of  a cylinder 
may  be  read  in  parallel  in  a single  disk  revolution.  This  modification  is  termed 
tracks-in-parallel-readout . In  addition,  the  mass  memory  of  DBC  is  content- 
addressable.  Given  a cylinder  number  and  a query  conjunction,  it  is  possible 
to  content-search  the  entire  cylinder  "on  the  fly"  in  the  same  revolution. 

For  a detailed  description  of  the  organization  of  MM,  the  reader  is  referred 
to  [8]. 

The  structure  loop  is  composed  of  four  components:  the  keyword  transformation 
unit  (KXU) , the  structure  memory  (SM) , the  structure  memory  information  processor 
(SMIP)  and  the  index  translation  unit  (IXU) . KXU  converts  keywords  into  their 
internal  representations.  SM  is  primarily  used  to  store,  retrieve  and  update 
the  indices  of  the  database.  Indices  are  maintained  in  SM  as  a directory.  Each 
entry  in  the  directory  consists  of  a keyword  followed  by  a set  of  indices.  An 
index  consists  of  a cylinder  number  (actually,  the  index  contains  some  more 
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information  as  we  shall  see  later)  which  indicates  whare  in  mass  memory 
records  containing  the  keyword  may  be  found.  Any  keyword  that  appears  as 
part  of  a directory  entry  in  SM  is  a directory  keyword.  For  example, 
security  keywords  are  automatically  defined  to  be  directory  keywords  and 
they  always  appear  as  part  of  directory  entries  in  SM.  However,  not  all 
simple  keywords  are  directory  keywords.  Non-directory  keywords  are  mainly 
used  by  MM  and  SFP  for  record  comparison  and  sorting  purposes.  SMIP  is  re- 
sponsible for  performing  set  intersections  on  the  indices  retrieved  by  SM. 

IXU  is  used  to  decode  the  indices  output  by  SMIP.  These  four  components  are 
designed  to  operate  concurrently  in  a pipeline  fashion.  The  hardware  organi- 
zation, details  and  design  philosophy  of  these  components  are  documented  in 
[5]. 

A major  part  of  this  report  will  concentrate  on  the  role  of  two  components  - 
namely,  the  database  command  and  control  processor  (DBCCP)  and  the  security 
filter  processor  (SFP)  - in  security  enforcement  and  data  clustering.  DBCCP 
regulates  the  operations  of  both  the  structure  and  data  loops  and  interfaces 
with  the  front-end  host  computer.  It  processes  all  DBC  commands  received  from 
the  front-end  host  computer,  schedules  their  execution  on  the  basis  of  the 
command  type  and  priority,  and  routes  the  response  to  the  front-end  host  computer. 
Additionally,  it  makes  use  of  SFP  to  search  the  tables  needed  for  the  enforce- 
ment of  security  and  the  clustering  of  records. 

The  use  of  DBCCP  and  SFP  provides  the  advantage  that  all  access  control 
checks  can  be  made  prior  to  the  retrieval  of  records  from  MM.  Consequently, 
access  imprecision  due  to  redundant  record  retrieval  never  occurs.  Thus,  the 
need  to  discard  retrieved  records  due  to  security  violation  does  not  arise 
since  they  are  retrieved  from  MM  only  after  they  are  cleared  for  access. 

1.3  Control  Flow  During  Command  Execution 

Figure  2 shows  how  certain  commands  are  executed  in  DBC.  Basically,  these 
commands  forwarded  from  the  front-end  host  computer  (in  pre-determined  formats) , 
are  recognized  by  DBCCP  as  either  access  commands  or  preparatory  commands. 

Access  commands  are  those  that  require  DBC  to  access  the  mass  memory;  prepara- 
tory commands  precede  and  follow  access  commands  and  convey  important  house- 
keeping information.  Access  commands  go  through  a security  check.  Having 
undergone  security  checks,  access  commands  are  translated  by  DBCCP  into  orders 
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that  can  be  processed  by  the  mass  memory.  This  translation  process  requires 
the  u 2 of  the  components  in  the  structure  loop.  Nevertheless,  for  our  dis- 
cussion we  have  not  shown  them  in  Figure  2.  During  transition,  an  access 
command  involving  insertion  activates  the  clustering  mechanism  in  DBCCP. 

This  clustering  mechanism  determines  the  mass  memory  cylinder  into  which  the 
record  must  be  inserted.  Records  retrieved  by  the  mass  memory  (as  a result 
of  the  execution  of  orders  and  the  supply  of  cylinder  numbers  provided  by 
DBCCP)  are  transmitted  to  the  sorter.  The  sorter  allows  groups  of  retrieved 
records  to  be  sorted  on  the  basis  of  some  attribute,  or  joined  with  other 
groups  of  records  (a  join  being  a relational  equality  join).  The  study  of  the 
sorter,  the  join  operation  and  update  is  included  in  the  next  report.  The  ex- 
clusion of  components  of  the  structure  loop  is  deliberate.  Although  all  four 
of  them  participate  in  the  translation  of  access  commands  of  DBC,  they  do  not 
play  a major  role  in  clustering  and  access  control.  For  this  reason,  we  shall 
overlook  their  presence. 


1.4  Organization  of  this  Report 

In  Section  2 we  shall  discuss  security  and  clustering  related  concepts. 

The  data  structures  needed  for  security  and  clustering  are  presented  in  Section 
3,  while  Section  4 is  devoted  to  a discussion  of  the  algorithms  necessary  for 
access  control  and  data  clustering.  Finally,  the  concepts  and  algorithms  are 
illustrated  by  means  of  an  integrated  example  in  Section  5. 


2.  SECURITY  AND  CLUSTERING  CONCEPTS 
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In  this  section,  we  shall  concentrate  on  the  concepts  that  form  the 
basis  of  the  security  and  clustering  mechanisms  of  DBC.  This  will  lead  us 
to  the  details  of  these  mechanisms  in  the  following  sections. 

2.1  Basis  of  DBC  Security  Mechanism 

To  protect  the  confidentiality  of  data,  two  principal  problems  must  be 
resolved  by  computer  systems.  The  first  is  to  conceal  the  data  in  user-com- 
puter and  computer-computer  communications  using  encryption  [7].  the  second 
is  to  determine  who  can  perform  what  operations  on  which  data.  The  latter  is 
the  problem  of  providing  access  control.  DBC  is  concerned  with  providing  an 
efficient  access  control  mechanism.  Encryption,  if  necessary,  may  be  support- 
ed by  the  use  of  encryption  and  decryption  boxes  at  either  end  of  the  communi- 
cation line  between  DBC  and  its  host. 

Naturally,  it  is  up  to  the  file  creator  to  specify  who  can  perform 
what  operations  on  which  data  for  his  file.  What  we,  as  designers,  would 
like  to  do  is  provide  him  with  an  easy,  yet  efficient,  mechanism  to  convey 
these  three  pieces  of  information  to  DBC.  We  shall  describe  this  mechanism 
of  DBC  in  the  following  paragraphs. 

The  user  can  convey  two  of  the  above  three  pieces  of  information  with- 
out any  difficulty.  It  is  easy  enough  for  the  file  creator  to  be  able  to 
specify  the  names  of  the  users  to  whom  he  wishes  to  give  permission  to  access 
his  file.  Likewise,  the  specification  of  a list  of  the  allowable  accesses 
(examples  of  accesses  being  read-only,  read-and-modify , etc.)  for  each  user 
permitted  to  access  the  file  does  not  provide  any  special  problem.  So,  the 
problem  of  access  control  really  boils  down  to  the  problem  of  how  to  specify 
the  portions  of  a file  to  which  accesses  are  allowed  for  the  different  users. 

Different  database  management  systems  have  different  ways  of  tackling 
the  problem.  Many  avoid  the  problem  altogether  by  allowing  an  access  on  the 
entire  file,  or  not  allowing  the  access  at  all.  This,  as  the  reader  will 
well  appreciate,  is  not  a powerful  enough  mechanism  since  it  is  lacking  in 
two  respects.  We  shall  demonstrate  the  two  weaknesses  of  this  mechanism  by 
means  of  examples.  Consider  the  case  of  a database  consisting  of  employee 
records,  where  all  the  employees  work  at  the  White  House.  Of  course,  one  of 
the  records  contains  information  on  the  president,  Jimmy  Carter.  Naturally, 
the  file  creator  would  not  wish  to  allow  anybody,  other  than  the  president. 
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to  read  that  particular  record  even  though  he  would  like  to  allow  many  users 
to  read  some  of  the  other  records  in  the  file.  However,  the  mechanism  described 
above  does  not  provide  the  file  creator  with  a way  to  allow  ’read'  access  on 
all  records  of  the  file  except  one.  He  is  faced  with  the  unpleasant  task  of  hav- 
ing to  leny  a person  'read'  access  to  the  entire  file,  just  because  that  per- 
son is  not  allowed  to  read  one  record  in  it. 

We  shall  again  utilize  the  database  of  the  above  example  to  illustrate 
the  second  weakness  of  this  mechanism.  Every  record  in  the  above  database 
consists  of,  say,  a Name  field,  a Job  Position  field  and  a Salary  field.  It 
is  very  likely  that  the  file  creator  would  want  to  allow  a particular  user 
'read'  access  to  the  Name  and  Job  Position  fields,  but  not  to  the  Salary  field. 
This  is  because  institutions  are  traditionally  very  sensitive  about  what  they 
pay  their  employees.  If  the  file  creator  were  to  use  the  mechanism  described 
above,  he  would  have  to  deny  users  'read'  access  to  entire  records  just  be- 
cause they  are  not  allowed  to  read  parts  of  these  records. 

A mechanism  used  by  many  database  management  systems  that  overcomes  one 
of  the  weaknesses  described  above  is  called  the  view  mechanism.  In  this  mech- 
anism, a user  is  not  provided  a complete  'picture'  of  the  database.  He  is 
only  provided  with  a window  or  view  of  the  database.  A view  consists  of  a 
subset  of  all  the  attributes  of  the  file.  For  example,  to  overcome  the 
second  weakness  described  above,  the  user  is  provided  with  a view  that  does 
not  contain  the  attribute  Salary.  He,  therefore,  views  an  employee  record 
as  consisting  of  only  a Name  field  and  a Job  Position  field.  Thus,  he  has  no 
way  of  accessing  the  Salary  field,  since,  as  far  as  he  is  concerned,  the  field 
does  not  even  exist.  For  individual  users,  separate  views  may  be  specified 
for  different  subsets  of  the  attributes  of  the  file.  In  DBC,  we  have  a secur- 
ity specification  mechanism  which  is  simpler  than  the  specification  of  views. 

Yet,  it  can  be  used  to  implement  views  with  very  good  performance  since  the 
security  enforcement  mechanism  of  DBC  is  supported  by  hardware. 

Another  method  that  is  sometimes  used  for  security  enforcement  is  called 
query  modification.  A user  query  is  executed  after  it  has  been  modified  with 
a collection  of  previously-specified  conjunctions  of  predicates  that  define 
the  records  accessible  by  the  user.  The  way  that  the  execution  of  the  modi- 
fied query  is  carried  out  causes  some  degree  of  access  imprecision,  i.e., 
many  records  are  actually  retrieved  and  examined  as  a result  of  the  user 
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query,  but  are  later  rejected  (i.e.,  not  passed  on  to  the  user)  since  they 
are  found  to  violate  the  conjunctions  of  predicates  that  define  the  access- 
ible records  . In  DBC,  we  shall  provide  a mechanism  which  first  determines 
a priori  a list  of  record  groups  (or  compartments  known  as  atoms)  that  are 
actually  accessible  by  the  user.  Only  these  record  groups,  and  no  more, 
need  to  be  examined;  thereby  eliminating  access  imprecision.  The  original 
query  (i.e.,  the  given  user  query)  is  now  evaluated  only  on  the  records  be- 
longing to  the  above  record  groups.  Thus,  the  time-consuming  process  of 
checking  each  individual  record  for  access  control  is  also  avoided. 

2.1.1  Subfile  and  Record  Level  Security  - The  Concept  of  Security  Atom 

We  wish  to  incorporate  in  DBC  a security-enforcement  mechanism  that 
will  allow  the  file  creator  to  protect  his  database  at  the  sub-file,  record 
and  attribute  levels  but  which  will  not  be  as  unwieldy  as  the  view  mechanism 
or  as  time-consuming  and  costly  as  the  query-modification  scheme.  To  be  able 
to  refer  to  portions  of  a file,  a file  creator  in  DBC  uses  query  conjunctions. 

A query  conjunction,  we  may  recall,  is  a conjunction  of  predicates.  Query 
conjunctions  and  predicates  used  in  this  manner,  solely  for  partitioning 
the  database  for  access  control,  are  called  security  conjunctions  and  security 
predicates  respectively.  A security  attribute  is  an  attribute  that  appears 
in  a security  predicate.  A keyword  of  a record  that  satisfies  a security 
predicate  is  called  a security  keyword. 

To  demonstrate  how  security  conjunctions  are  used  to  partition  a file 
into  record  aggregates,  we  use  the  same  sample  database  of  the  previous  ex- 
amples. Let  us  assume  that  the  file  creator  specifies  only  one  security 
conjunction,  and  this  is: 

(Salary>15 ,000) . 

In  order  to  keep  the  example  as  simple  as  possible,  the  security  conjunction 
has  been  chosen  to  contain  only  one  security  predicate.  This  could,  potential- 
ly, partition  the  database  into  two  compartments.  The  first  compartment 
contains  all  records,  if  any,  which  contains  a keyword  that  satisfies  the 
above  predicate.  The  second  compartment  c.itains  all  records,  if  any,  which 
do  not  contain  a keyword  that  satisfies  the  predicate. 

Along  with  each  security  conjunction,  the  file  creator  also  specifies, 
for  each  user,  a list  of  accesses  permitted  on  all  records  that  satisfy  the 
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security  conjunction.  This  list  of  accesses  is  called  an  access  descriptor. 

For  each  security  conjunction,  the  file  creator  specifies  p access  descriptors, 
where  p is  the  number  of  users  allowed  to  access  the  file. 

To  take  a more  complex  example,  consider  that  the  file  creator  speci- 
fies two  security  conjunctions  SI  and  S2  as  shown  below: 

SI:  (Salary  15,000)  A (Salary<30,000). 

S2 : (Job  Position= SECRETARY). 

Before  we  complete  the  example,  we  shall  define  what  we  mean  by  a record 
satisfying  a security  conjunction.  A record  satisfies  a security  conjunction 
if  the  record  contains  keywords  that  satisfy  every  security  predicate  in  the 
security  conjunction.  If  a record  satisfies  a security  conjunction,  then 
we  say  that  the  security  conjunction  is  satisfied  by  the  record.  A record 
does  not  satisfy  a security  conjunction  if  the  record  does  not  contain  key- 
words to  satisfy  some  security  predicate  in  the  security  conjunction. 

In  the  example  being  developed,  four  potential  partitions  or  compartments 
may  be  created  for  the  file,  The  first  compartment  contains  records  of  sec- 
retaries earning  between  $15,000  and  $30,000.  The  second  compartment  contains 
records  of  secretaries  earning  less  than  $15,000  or  more  than  $30,000 , 
and  the  third  compartment  contains  records  of  employees  who  are  not  secretaries 
and  who  earn  between  $15,000  and  $30,000,  The  fourth  compartment  contains 
records  of  employees  who  are  not  secretaries  and  who  earn  less  than  $15,000  or 
more  than  $30,000.  More  formally,  the  first  compartment  contains  records  which 
satisfy  SI  and  S2.  The  second  compartment  contains  records  that  satisfy  S2  but 
do  not  satisfy  SI.  The  third  compartment  contains  records  that  satisfy  SI  but 
do  not  satisfy  S2.  The  fourth  compartment  contains  records  that  satisfy  neither 
SI  nor  S2.  Thus,  the  compartment  that  a record  belongs  to  is  determined  by  the 
security  conjunctions  it  satisfies.  It  follows  that  a maximum  of  2n  compartments 
may  be  created  for  a file  with  n security  conjunctions.  However,  in  a real  database, 
the  total  number  of  non-empty  compartments  is  much  smaller  than  this  theoretical 
upper  limit  since  many  compartments  do  not  have  any  records. 

The  compartment  that  a record  of  a file  belongs  to  may  be  determined  by  the 
following  algorithm.  Assume  that  a set  S of  security  conjunctions  has  been 
specified  for  the  file  by  the  file  creator.  Going  through  the  set  S,  we  deter- 
mine for  the  record  the  largest  subset  of  S that  the  record  satisfies.  This 
unique  subset  of  security  conjunctions  defines  the  compartment  to  which  the 
record  belongs.  Notice  that  DBC  carries  out  the  function  of  forming  these 


compartments  using  the  security  conjunctions  specified  by  the  file  creator, 
and  the  file  creator  and  users  need  not  even  be  aware  of  them. 

The  non-empty  compartments  of  a file  created  as  a result  of  the  security 
conjunctions  specified  by  the  file  creator  are  called  security  atoms.  In  the 
aforementioned  example,  there  are  at  most  four  security  atoms.  A security 
atom,  or  simply  an  atom,  defines  a set  of  records  such  that  all  these  records 
satisfy  the  same  unique  set  of  security  conjunctions  (and  not  any  of  the 
other  security  conjunctions  of  the  file);  this  set  of  security  conjunctions 
is  said  to  define  the  security  atom.  As  record  sets,  atoms  have  an  important 
property  - no  two  atoms  have  a record  in  common . The  concept  of  security  atoms 
is  due  to  [8] . 

For  any  user,  the  accesses  permitted  on  a security  atom  are  determined 
as  follows.  The  security  conjunctions  that  define  the  atom  are  first  identi- 
fied. Each  of  these  security  conjunctions  is  associated  with  an  access 
descriptor  for  each  user.  For  all  these  security  conjunctions,  therefore, 
it  is  possible  to  identify  the  access  descriptors  relevant  to  the  given  user. 
The  accesses  permitted  to  the  user  on  the  given  security  atom  are  then  de- 
termined by  intersecting  the  above  access  descriptors. 

In  Figures  3a,  3b,  and  3c  we  have  illustrated  the  concept  by  means  of 
an  example.  Figure  3a  shows  a file  consisting  of  eight  records,  here  each 
record  has  four  keywords.  In  Figure  3b,  we  indicate  the  five  security  con- 
junctions of  the  file  and  the  corresponding  access  descriptors  specified  by 
the  file  creator  for  one  particular  user.  Finally,  in  Figure  3c,  we  show 
the  security  atoms  created  for  the  file  by  DBC  and  corresponding  permitted 
accesses.  Note,  that  only  six  atoms  have  been  created . although  up  to 
(=32)  security  atoms  are  possible.  This  is  because,  typically,  the  number 
of  atoms  of  a file  is  much  less  than  the  number  of  records  in  the  file. 

The  figure  also  shows  the  composition  of  the  atoms  in  terms  of  security 
conjunctions  and  in  terms  of  records.  For  example,  we  see  that  the  record  R1 
belongs  to  the  security  atom  1,  which  is  defined  by  the  security  conjunctions 
SI  and  S4.  To  see  why  this  is  true,  let  us  consider  the  keywords  of  Rl. 

R1  has  a keyword  <Salary,  20,000>.  Consider  this  keyword  and  the  security 
conjunction  SI,  which  is  (Salary>l,000)  A (Salary<20,000) . Since  this  key- 
word of  Rl  satisfies  all  the  security  predicates  in  SI,  Rl  satisfies  SI. 

Rl  has  another  keyword  <Length-Of-Service , 6>.  This  keyword  of  Rl  obviously 
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satisfies  S4  which  has  one  equality  predicate  (Length-Of-Service  =6). 

Hence,  R1  satisfies  SI  and  S4,  and  hence  belongs  to  the  atom  defined  by  these 
two  conjunctions,  namely.  Atom  1 as  depicted  in  Figure  3c.  We  say  that  a 
security  conjunction  is  applicable  to  an  atom  if  the  records  of  the  atom 
satisfy  the  security  conjunction.  In  this  example,  SI  and  S4  are  applicable 
to  Atom  1,  Applicable  conjunctions  are  listed  alongside  of  the  atom  they 
are  applicable  to  in  Figure  3c.  This  example  does  not  show  the  field  or 
attribute  level  security  which  will  also  be  incorporated  as  an  integral 
part  of  DBC  security  mechanism.  For  an  example  that  demonstrates  this  cap- 
ability of  DBC,  the  reader  is  referred  to  Section  2.1,3. 

2.1.2  The  Atom  Access  Privelege  List  (AAPL) 

The  discussion  of  the  previous  subsection  tells  us  that  in  order  to 

determine  if  an  access  is  permitted  on  a record,  it  is  enough  for  DBC  to 

determine  if  the  access  is  permitted  on  the  security  atom  containing  the 
record  since  all  records  in  an  atom  are  protected  similarly.  Thus,  it  is 
necessary  to  first  determine  those  security  conjunctions  which  are  applicable 
to  the  atom  containing  the  record.  As  we  have  stated  earlier,  the  file  cre- 
ator specifies,  with  every  security  conjunction,  the  permitted  accesses  for 
every  user.  The  accesses  of  the  applicable  security  conjunctions  may  then  be 
'anded'  to  determine  if  the  requested  access  on  the  record  is  permitted  for 
a particular  user.  A list  of  atom  identifiers  and  the  corresponding  permit- 
ted accesses  is  called  the  Atomic  Access  Privilege  List  (AAPL).  One  such 
list  exists  for  every  user  of  the  file.  Since  the  number  of  security  atoms 
is  expected  to  be  much  smaller  than  the  number  of  records,  this  list  will  be 

relatively  short,  thus  enabling  DBC  to  scan  it  rapidly  to  determine  whether 

the  requested  access  may  be  granted.  Furthermore,  the  list  is  created  long 
before  the  user  initiates  a request.  This  is  because  DBC  can  create  a list 
of  atoms  in  terms  of  security  conjunctions  during  file  creation  time.  Know- 
ing for  each  atom  the  set  of  security  conjunctions  applicable  to  that  atom, 
it  can  then  'compute'  the  permitted  accesses  on  that  atom.  Thus,  the  process 
of  security  enforcement  has  been  greatly  simplified.  We  note,  finally,  that 
security  enforcement  is  carried  out  by  the  database  command  and  control  pro- 
cessor (DBCCP)  using  the  security  filter  processor  (SFP)  as  and  when  it 
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2.1.3  Field-level  Security 

Earlier,  we  have  stated  that  along  with  each  security  conjunction,  the 
file  creator  specifies  p access  descriptors  where  p is  the  number  of  users 
permitted  to  access  the  file.  That  is,  corresponding  to  each  security  con- 
junction, he  specifies  one  access  descriptor  for  each  user  permitted  to  ac- 
cess the  file.  This  mechanism  is  now  extended  to  support  field-level  secur- 
ity. Along  with  each  security  conjunction,  the  file  creator  specifies,  for 
each  user,  a set  of  pairs.  Each  pair  is  called  a field-level  security  de- 
scriptor. These  pairs  are  of  the  form  (attribute  combination,  access  descrip- 
tor). An  attribute  combination  is  a set  of  attributes.  An  example  of  an 
attribute  combination  is: 

[Name,  Salary]. 

An  example  of  a pair  specified  by  the  file  creator  is: 

([Name,  Salary],  No-Read). 

An  example  of  a security  conjunction  and  a set  of  pairs  is: 

(Salary>100) ([Name,  Salary],  No-Read),  ([Job],  No-Modify). 

The  meaning  of  this  is  as  follows.  The  user  is  not  permitted  to  read  the 

Name  and  Salary  fields  from  records  satisfying  (Salary>100)  although  the 

the  rest  of  the  fields  of  these  records  are  accessible  to  the  user.  Also, 

the  user  is  not  allowed  to  modify  the  Job  field  of  records  satisfying 

(Salary>100) . Notice,  that  the  access  descriptors  always  specify  accesses 

that  are  not  allowed.  Naturally,  if  an  access  is  disallowed  on  an  attribute 

combination  AC.,  then  it  is  disallowed  on  all  attribute  combinations  which  have 
1 

AC^  as  a subset. 

The  mechanism  which  includes  field-level  security  is  explained  by 
means  of  an  example  developed  in  Figures  3d  and  3e.  This  example  pertains 
to  the  same  file  shown  in  Figure  3a.  In  Figure  3d,  we  indicate  the  security 
conjunctions  of  the  file  and  the  corresponding  field-level  security  descrip- 
tors specified  by  the  file  creator  for  one  particular  user.  Figure  3e 
shows  the  security  atoms  created  for  the  file  by  DBC  and  the  corresponding 
field-level  security  descriptors  which  indicate  the  disallowed  accesses  on 
various  attribute  combinations.  The  composition  of  the  atoms  in  terms  of 
security  conjunctions  and  in  terms  of  records  is  also  indicated.  Let  us 
explain  how  the  field-level  security  descriptors  indicating  the  disallowed 
accesses  on  Atom  1,  as  shown  in  Figure  3e,  was  obtained.  Atom  1 is  defined 


Figure  3d,  The  security  conjunctions  and  corresponding  Field-level  security  descriptors  for  a 
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by  security  conjunctions  SI  and  S4.  From  Figure  3d,  we  see  that  the  field- 
level  security  descriptors  corresponding  to  SI  do  not  allow  'read'  access 
on  both  the  Name  and  Salary  attributes.  Naturally,  if  the  'read'  access  is 
disallowed,  then,  so  is  the  'modify'  access.  The  field-level  descriptors 
corresponding  to  S4  allow  the  'read'  access  but  do  not  allow  the  'modify' 
access  on  the  same  two  attributes.  Hence,  the  intersected  set  of  descriptors 
which  indicates  the  disallowed  accesses  on  records  in  Atom  1,  will  disallow 
the  'read'  (and  hence  the  'modify')  access  on  the  Name  and  Salary  fields. 
Similarly,  even  though  the  'read'  access  is  allowed  on  the  Position  attri- 
bute by  the  field-level  security  descriptors  corresponding  to  S4,  it  is 
disallowed  by  those  corresponding  to  SI.  Hence,  the  'read'  access  is  dis- 


allowed on  the  Position  attribute  of  records  in  Atom  1. 


2.2  Data  Clustering  in  DBC 


2.2.1  Motivation  for  Clustering 

The  need  for  placing  data  elements,  which  have  a high  probability  of 
being  retrieved  and  updated  together,  in  close  proximity  in  order  to  enhance 
performance  has  been  the  subject  of  research  [ 14 , 16],  Strategies  for  the 
'optimal'  placement  of  data  are  known  as  clustering  strategies.  Naturally, 
clustering  strategies  are  dependent  on  the  nature  and  use  of  the  database 
for  which  they  are  intended. 

Database  management  systems  usually  provide  clustering  mechanisms  to 
support  clustering  strategies.  These  mechanisms  are  dependent  on  the  tech- 
nology used  to  implement  a database  management  system.  For  example,  in  a 
RAP-like  system  [12,  13],  the  search  and  retrieval  time  is  essentially  constant 
with  respect  to  the  size  of  the  database  as  long  as  the  size  does  not  exceed 
the  memory  capacity  of  the  search  processor.  Therefore,  there  is  no  need  for 
a RAP-like  system  to  have  a clustering  mechanism  if  the  database  can  be  accom- 
modated completely  within  the  memory  capacity  of  the  search  processor.  If, 
however,  the  database  size  is  larger  than  the  memory  capacity  of  the  search 
processor,  than  a clustering  mechanism  is  needed  in  order  to  minimize  the 
number  of  loading  and  unloading  operations  per  user  request.  As  another  ex- 
ample, in  conventional  software-laden  database  management  systems  the  infor- 
mation brought  from  a disk  device  into  high-speed  main  memory  is  contained  in 
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pages  (e.g.,  a page  is  a fraction  of  a disk  track).  In  such  cases,  a soft- 
ware clustering  package  is  used  at  page  creation  time  in  order  to  ensure  that 
these  pages  contain  a high  percentage  of  data  relevant  to  a subsequent  re- 
quest. This  will  enable  the  syf  _m  to  minimize  the  I/O  traffic  and  buffering 
due  to  paging,  thereby  improving  system  performance. 

In  our  design  of  DBC,  the  database  is  stored  on  modified  moving-head 
disk  devices  with  tracks-in-parallel-readout  capability.  In  such  devices, 
the  search  and  retrieval  time  for  data  elements  within  a cylinder  is  a con- 
stant. This  constant  is  the  rotation  time  of  the  disk  device.  However,  the 
arm  movement  time,  which  in  most  instances  is  substantially  greater  than  the 
rotation  time,  varies  roughly  in  proportion  to  the  distance  over  which  the 
access  mechanism  is  moved.  It  is  clear,  therefore,  that  in  order  to  derive 
the  best  possible  performance  from  a modified  moving-head  device,  it  is  im- 
portant that  we  place  the  data  elements  in  a way  that  minimizes  the  number  of 
cylinders  to  be  searched  per  user  request.  Here,  we  describe  the  two-level 
clustering  mechanism  of  DBC  which  is  to  be  used  to  support  the  clustering 
strategies  devised  by  the  users. 

Let  us  motivate  the  concept  of  clustering  and  the  resulting  performance 
improvement  by  means  of  a simple  example  [4]  of  a file  with  four  records  as 
depicted  in  Figure  4a. 

In  Figure  4b  we  have  shown  an  arbitrary  placement  of  records  in  the  two 
cylinders  that  have  been  made  available  to  the  file  F.  Now,  if  a query  for 
retrieval  is  received  in  the  form,  "Retrieve  records  which  satisfy  the  con- 
junction (K1AK3)",  then  DBC  has  to  make  two  cylinder  accesses.  However,  if 
the  records  are  placed  in  the  cylinders  grouped  according  to  the  occurrence 
of  keywords  (i.e.,  Kl,  K2  and  K3)  in  a record,  then  the  resulting  configura- 
tion will  be  as  shown  in  Figure  4c.  Such  a configuration  will  facilitate 
the  retrieval  of  all  records  which  satisfy  the  given  query  conjunction  with 
a single  access  to  the  mass  memory. 

The  above  diccussion  implies  two  things:  First,  the  creator  of  the 
file  should  have  some  ideas  of  the  type  of  queries  that  will  be  made  on  the 
file,  if  he  is  interested  in  taking  advantage  of  possible  performance  gain. 
Second,  the  system  (i.e.,  DBC)  provides  him  with  a mechanism  for  effective- 
ly conveying  that  knowledge  to  DBC.  While  we,  as  system  designers,  cannot 
predict  how  much  knowledge  a file  creator  may  have  of  his  file  usage,  we 
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Figure  4a,  Records  belonging  to  a file 


-CYLINDER  JL 


Ki 


k2 


Ki 


CYLINDER.  2 


h 


K* 


Ki 


Figure  4b:  An  arbitrary  assignment  of  records  to  cylinder. 
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trust  ensure  that  he  is  provided  with  an  easy  yet  powerful  mechanism  to  u- 
tilize  that  knowledge  to  his  best  advantage.  The  mechanism  that  we  have 
adopted  and  shall  describe  here  is  capable  of  being  naturally  integrated 
into  the  query  language  available  to  the  users. 

2.2.2  Clustering  Descriptors  and  Logical  Clusters 

A file  is  associated  (by  the  file  creator)  with  a single  primary  clus- 
tering attribute  and  any  number  of  secondary  clustering  attributes.  The 
latter  attributes  are  specified  as  a list  in  an  order  of  importance . The 
importance  of  a secondary  clustering  attribute  is  defined  to  be  its  rela- 
tive position  in  the  above  list.  Thus,  we  can  talk  of  one  secondary  clus- 
tering attribute  as  being  more  important  than  another  secondary  clustering 
attribute  for  clustering  purposes. 

At  the  time  of  file  creation,  the  file  creator  also  specifies  a set 
of  clustering  descriptors.  These  descriptors  may  be  one  of  three  types: 

Type  A:  The  descriptor  is  a conjunction  of  a less-than-or-equal-to  predi- 
cate and  a greater-than-or-equal-to  predicate,  such  that  the  same 
attribute  appears  in  both  predicates.  An  example  of  a type-A 
descriptor  is  as  follows: 

( (Salary>2 ,000)  A (Salary<10 ,000) ) . 

More  simply,  this  is  written  as  follows: 

(2,000<Salary<10,000) . 

Thus,  the  file  creator  merely  specifies  an  attribute  (i.e..  Salary) 
and  a range  of  values  ($2,000  - $10,000)  for  that  attribute.  We 
term  the  value  to  the  left  of  the  attribute  the  lower  limit  and 
the  value  to  the  right  of  the  attribute  the  upper  limit. 

Type  B:  The  descriptor  is  an  equality  predicate.  An  example  of  a type-B 
descriptor  is: 

(Position=PR0FESS0R) . 

Type  C:  The  descriptor  consists  of  only  an  attribute  name.  Let  us  assume 
that  there  are  n different  keywords  Kl,  K2,  ...,  Kn,  in  the  re- 
cords of  this  file,  with  this  attribute.  Then,  this  type-C  descrip- 
tor is  really  equivalent  to  n type-B  descriptors  Bl,  B2 , ...,  Bn, 
where  Bi  is  the  equality  predicate  satisfied  by  Ki.  In  fact,  this 
type-C  descriptor  will  cause  n different  type-B  descriptors  to  be 
formed.  From  now  on,  we  shall  refer  to  the  type-B  descriptors 
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formed  from  a type-C  descriptor  as  type-C  sub-descriptors. 

The  attribute  that  appears  in  a clustering  descriptor  must  either  be 
the  primary  clustering  attribute  or  one  of  tie  secondary  clustering  attri- 
butes. A clustering  descriptor  is  a primary  (secondary)  clustering  descrip- 
tor if  the  corresponding  attribute  is  a primary  (secondary)  clustering 
attribute. 

The  file  creator  must  observe  certain  rules  in  forming  descriptors. 

These  are  specified  below: 

(1)  Ranges  specified  in  type-A  descriptors  for  a given  attribute  must  be 
mutually  exclusive. 

(2)  For  every  type-B  descriptor  of  the  form  (attribute-1  = value -1 ) , no 
type-A  descriptor  car.  have  the  same  attribute  and  a range  that  con- 
tains its  value  (i.e.,  value-1). 

(3)  An  attribute  that  appears  in  a type-C  descriptor  must  not  also  appear 
in  a type-A  or  a type-B  descriptor. 

A primary  (secondary)  clustering  keyword  is  a keyword  of  a record  such 
that  one  of  the  following  holds: 

(a)  The  attribute  of  the  keyword  is  specified  in  a type-A  primary  (secondary) 
clustering  descriptor  and  the  value  is  within  the  range  of  the  descriptor. 

(b)  The  attribute  and  value  of  the  keyword  match  those  specified  in  a type-B 
primary  (secondary)  clustering  descriptor. 

(c)  The  attribute  of  the  keyword  is  specified  in  a type-C  primary  (secon- 
dary) clustering  descriptor. 

i In  all  these  cases,  the  primary  (secondary)  clustering  keyword  is  said 

to  be  derived  or  derivable  from  the  corresponding  primary  (secondary)  clus- 
tering descriptor.  Each  primary  clustering  descriptor  is  associated  with 
a maximum  space  requirement  (in  terms  of  number  of  cylinders)  which  indi- 
cates the  estimated  amount  of  storage  required  in  the  mass  memory  for  all 
records  having  keywords  derived  from  this  descriptor.  The  estimate  is  only 
an  approximate  one,  but  better  performance  is  obtained  if  the  estimate 
closely  reflects  the  actual  maximum  storage  requirement.  The  importance  of 
a secondary  clustering  keyword  is  defined  to  be  the  same  as  the  importance 
of  the  corresponding  secondary  clustering  attribute  of  the  keyword. 

Every  (primary  clustering  descriptor,  secondary  clustering  descriptor) 
defines  a cluster  of  records.  Each  record  in  this  cluster  must  satisfy  two 
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conditions. 

(1)  The  primary  clustering  keyword  of  the  record  should  be  derivable  from 
the  primary  clustering  descriptor  of  the  cluster. 

(?)  The  most  important  secondary  clustering  keyword  £f  the  record  should 
be  derivable  from  the  secondary  clustering  descriptor  of  the  cluster. 

Note  that  the  most  important  secondary  clustering  keyword  of  one  record 
need  not  be  the  same  as  the  most  important  secondary  clustering  keyword  of 
another  record.  In  fact,  the  most  important  secondary  clustering  attribute 
in  the  keywords  of  one  record  need  not  be  the  same  as  the  most  important 
secondary  clustering  attribute  in  the  keywords  of  another  record.  This  is 
because  the  secondary  clustering  attributes  present  in  the  keywords  of  dif- 
ferent records  may  be  different.  A record  for  insertion  must  have  a pri- 
mary clustering  keyword;  otherwise,  the  record  will  be  rejected  by  DBC. 

The  record  is  also  rejected  if  its  primary  clustering  keyword  is  not  deriv- 
able from  any  primary  clustering  descriptor.  If  a record  has  no  secondary 
clustering  keyword,  then  a null  secondary  clustering  keyword  derivable 
from  a null  secondary  clustering  descriptor  is  assumed.  If  the  most  im- 
portant secondary  clustering  keyword  is  not  derivable  from  any  secondary 
clustering  descriptor,  it  is  assumed  to  be  derived  from  a null  secondary 
clustering  descriptor. 

We  illustrate  these  concepts  by  means  of  an  example  developed  in  Fi- 
gures 5a,  5b,  5c  and  5d.  Figure  5a  shows  a file  consisting  of  six  records, 
where  each  record  has  four  attributes.  As  can  be  seen  from  the  figure,  the 
user  has  specified  the  primary  clustering  attribute  to  be  Job  and  the  sec- 
ondary clustering  attributes  in  order  of  importance  to  be  Salary  and  Depart- 
ment Number.  In  Figures  5b  and  5c,  the  primary  and  secondary  clustering 
descriptors  of  the  file  have  been  shown.  Finally,  in  Figure  5d,  we  illus- 
trate the  various  clusters  formed  for  the  file  and  their  composition  both 
in  terms  of  primary  and  secondary  clustering  descriptors,  and  in  terms  of 
the  records  that  make  up  the  clusters.  For  example,  the  record  R1  belongs  to  the 
cluster  1,  which  is  defined  by  the  primary  clustering  descriptor  PY1  and 
the  secondary  clustering  descriptor  SEC1.  To  see  why  this  is  so,  let  us 
look  at  the  keywords  of  Rl.  Rl  contains  the  keyword  <Job,  MANAGER>  and  this 
is  obviously  derivable  from  the  primary  clustering  descriptor  PY1,  i.e., 

(Job  = MANAGER).  Also,  Rl  contains  the  keyword  <Salary,  1500>  and  this  is 
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Primary  clustering  attribute:  JOB 
Secondary  clustering  attributes  in  order  of 
importance:  SALARY.,  DAPARTTEJfr -NUMBER 
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Figure  5a.  The  six  records  of  a file,  and  its 
primary  and  secondary  clustering 
attributes. 
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Figure  5b.  Primary  clustering  descriptors 
of  File  F. 
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Figure  5c.  Secondary  clustering  descriptors  of 
File  F. 


derivable  from  the  secondary  clustering  descriptor,  SECl,  (i.e.,  lOOO^Sal- 
ary<2000)  since,  1500  lies  between  1000  and  2000.  It  also  contains  the 
keyword  department  Number,  100> , which  is  derivable  from  another  secondary 
clustering  descriptor,  SEC3.  However,  for  clustering  purposes  we  only  con- 
sider the  primary  clustering  descriptor  PY1  and  the  secondary  clustering 
descriptor  SECl  of  R1  from  which  the  primary  clustering  keyword  of  the  re- 
cord (i.e.,  <Job,  MANAGER>)  and  the  most  important  secondary  clustering 
keyword  of  the  record  (i.e.,  <Salary,  1500>)  are  derivable.  Hence,  R1  be- 
longs to  the  cluster  defined  by  PY1  and  SECl. 

It  has  been  mentioned  earlier  that  DBC  can  support  three  kinds  of  data- 
base organizations  - relational,  network,  and  hierarchical.  We  suggest  that 
the  primary  clustering  attribute  for  a relational  database  be  'relation'. 

The  reason  for  clustering  by  relation  is  because  database  queries  on  rela- 
tional databases  will  always  involve  one  or  more  relations.  In  this  manner, 
we  will  always  be  ensured  that  DBC  will  satisfy  any  given  query  involving 
a single  relation  by  accessing,  at  most,  a number  of  cylinders  that  is  no 
greater  thah  the  number  of  cylinders  required  to  store  all  the  records  in 
the  given  relation.  Further  clustering  may  be  done  based  on  the  information 
on  images  and  links  [1,  9]. 

For  a network  (CODASYL)  database,  primary  clustering  may  be  based  on 
the  record  location  modes  [2,10].  Alternatively,  the  primary  clustering 
may  be  based  on  the  record  type  [2,10]  and  the  secondary  clustering  may  be 
based  on  the  record  location  modes. 

For  hierarchical  databases  such  as  the  IMS  databases,  three  different 
clustering  policies  are  suggested  [3].  The  first  policy  clusters  all  DBC 
records  which  represent  segments  belonging  to  the  same  IMS  root  segment 
[11],  The  second  policy  first  clusters  DBC  records  which  represent  all 
the  IMS  root  segments  and  then  clusters  DBC  records  which  represent  all 
dependent  segments.  The  third  policy  clusters  the  records  by  segment  type. 

2 . 3 Introducing  the  Security  and  Clustering  Mechanisms 

We  shall  now  informally  sketch  the  process  of  file  creation  in  DBC 
(the  algorithms  are  formally  presented  in  Section  4),  which  will  also  clar- 
ify the  mechanisms  of  data  clustering  and  access  control. 


The  creator  first  issues  an , Open-database-f ile-for-creation . request. 
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All  requests  to  DBC  consist  of  six  parts: 

(1)  FEC-ID,  which  is  the  identification  number  of  the  front-end  computer 

from  which  the  request  originated, 

(2)  R-ID,  which  is  the  identification  number  of  the  request, 

(3)  R-CODE,  which  is  a coded  version  of  the  request  or  command  name, 

(4)  the  priority  of  the  request, 

(5)  the  identification  number  of  the  user  making  the  request,  and 

(6)  the  argument  set  (queries,  records  etc.) 

A logical  view  of  the  ' Open-database-f ile-for-creation'  request  is 
shown  in  Figure  6.  It  provides  information  on  the  number  of  attributes 
the  file  is  to  have,  the  number  of  mass  memory  cylinders  that  need  to  be 
allocated  for  the  file  initially,  and  the  number  of  cylinders  that  may  be 
allocated  additionally  if  the  initial  allocation  is  insufficient. 

Next,  the  file  creator  issues  the  ' Load-attribute- inf ormation ' request. 
A logical  view  of  the  request  is  shown  in  Figure  7.  This  request  is  used 
to  provide  DBC  with  information  on  the  attributes  used  in  the  file.  This 
information  includes,  for  every  attribute  of  the  file,  the  maximum  and  min- 
imum values  the  attribute  may  take,  whether  the  attribute  is  a primary  clus- 
tering or  secondary  clustering  attribute,  and  the  relative  importance  of  the 
attribute  if  it  is  a secondary  clustering  attribute. 

At  this  stage,  the  file  creator  issues  a 'Load-descriptors'  request, 
the  format  of  which  is  shown  in  Figure  8a.  Each  of  the  descriptors  in  the 
request,  as  Figure  8b  illustrates,  includes  the  identification  number  of 
the  descriptor,  whether  it  is  a primary  clustering  or  secondary  clustering 
descriptor,  the  type  of  the  descriptor  (i.e.,  A,  B or  C) , the  identification 
number  of  the  attribute  corresponding  to  the  descriptor,  and  the  range  of 
values  this  attribute  takes.  Also,  the  maximum  space  requirement  for  pri- 
mary clustering  descriptors  is  indicated. 

The  file  creator  now  loads  all  the  necessary  access  control  informa- 
tion by  means  of  a 'Load-access-control-information'  request,  a logical 
view  of  which  is  shown  in  Figure  9a.  This  request  identifies  the  users 
who  are  allowed  to  access  the  file,  and  assigns  to  each  user  U,  K access 
privileges  Ai  (l£i<K)  corresponding  to  the  K security  conjunctions  Si 
(l^i£K) . An  access  privilege  Ai,  for  user  U,  specifies  the  kinds  of  acces- 
ses the  user  U may  not  make  on  all  records  satisfying  Si.  A view  of  an 
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access  privilege  is  shown  in  Figure  9b.  These  accesses  need  not  be  on  en- 
tire records,  but  may  be  only  on  certain  combinations  of  attributes  in  the 
records.  Thus,  for  example,  the  creator  of  the  file  may  specify  that  user 
U does  not  have  'read'  access  to  the  Job  and  Department  attributes  of  all 
records  satisfying  security  conjunction  Si.  A default  access  descriptor 
is  also  included  as  part  of  this  request.  This  is  defined  as  the  set  of 
accesses  allowed  on  all  records  belonging  to  the  security  atom  to  which  no 
security  conjunction  is  applicable. 

The  file  creator  now  issues  a series  of  'Load-record'  requests.  The 
format  of  this  request  is  shown  in  Figure  10.  As  the  records  are  received 
by  DBC,  two  things  are  determined  for  each  of  them,  namely,  the  security 
atom  and  the  cluster  to  which  it  belongs.  The  cluster  that  the  record  belongs 
to  is  used  in  order  to  determine  where  in  the  mass  memory  this  record  is 
to  be  placed,  and  also  to  create  a list  of  cluster  identifiers  and  their 
descriptions  in  terms  of  the  primary  clustering  descriptor  and  the  most 
important  secondary  clustering  descriptor.  The  security  atom  to  which  the 
record  belongs  is  used  to  create  a list  of  security  atom  identifiers  and 
their  descriptions  in  terms  of  the  security  conjunctions  making  up  the 
atom. 

The  file  creator  now  issues  a 'Close-database-file'  request.  This 
request  has  no  arguments.  At  this  stage,  DBC  does  the  following  for  each 
of  the  P users  allowed  to  access  the  file  that  has  just  been  'closed'  by 
the  creator.  For  each  security  atom,  the  set  of  security  conjunctions  that 
are  applicable  to  that  atom  are  known.  At  this  point,  the  set  of  accesses 
not  permitted  to  each  user  can  be  computed  with  respect  to  each  atom  by 
'anding'  (intersecting)  the  access  descriptors  of  the  security  conjunctions 
applicable  to  the  atom.  Thus,  the  list  of  atoms  and  the  disallowed  accesses  is 
compiled  for  each  of  the  P users  in  an  atomic  access  privilege  list  (AAPL) 

(one  for  each  user) . Since  the  number  of  security  atoms  is  expected  to  be 
much  smaller  than  the  number  of  records,  this  list  will  be  relatively  short. 

When  a user  command  is  received  by  DBC,  the  record  specification  of  the 
command  can  be  either  a query  or  a record  tag.  If  the  specification  is  a 
query,  DBC  infers  the  atoms  referred  to  by  the  query  by  looking  up  auxili- 
ary information  stored  in  the  structure  memory.  The  record  tag,  on  the 
other  hand,  contains  the  name  of  the  security  atom  which  contains  the  record 
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FIGURE  9b.  An  Expanded  View  of  the  Access  Privileges  Field 
in  the  'Load-Access-Control-Information'  Request 
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referenced  by  the  user  request.  By  scanning  the  AAPL  of  the  user  for  se- 
curity atom(s) , DBC  can  either  deny  or  permit  the  access. 
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3.  DATA  ORGANIZATION  IN  THE  STRUCTURE  LOOP 

Data  structures  are  maintained  in  DBC  for  each  file  and  for  the  users 
of  the  files  in  order  to  ensure  that  no  unauthorized  accesses  are  allowed 
on  the  database  stored  in  the  mass  memory  of  DBC.  Also,  data  clustering  is 
achieved  by  means  of  tables  that  maintain  cluster  definitions  and  informa- 
tion on  space  availability  in  the  disk  cylinders  comprising  the  mass  mem- 
ory (MM).  This  section  has  been  divided  into  three  sub-sections.  The 
first  sub-section  deals  with  those  tables  that  are  needed  for  security  en- 
forcement. The  second  sub-section  describes  the  tables  needed  for  data 
clustering.  The  discussion  of  the  relationship  of  the  tables  in  these  two 
sub-sections  with  DBC  components  will  be  included  in  a later  section.  The 
final  sub-section  describes  some  of  the  other  tables  used  by  DBC  to  main- 
tain the  names  of  the  users  and  their  files.  These  tables  will  have  to  be 
referred  to  by  DBC  in  order  to  support  the  mechanisms  of  access  control 
and  data  clustering.  Some  of  the  tables  dealt  with  in  the  last  sub-section 
are  stored  in  the  database  command  and  control  processor  (DBCCP)  and  the 
rest  are  stored  in  the  index  translation  unit  (IXU). 

We  may  recall,  from  Section  1,  that  an  entry  in  the  structure  memory 
(SM)  essentially  consists  of  a keyword  followed  by  a list  of  cylinder  num- 
bers. More  specifically,  an  entry  in  SM  consists  of  a keyword  followed  by 
a list  of  pairs  known  as  index  terms,  where  each  index  term  is  of  the  form 
(cylinder  number,  security  atom  name).  The  cylinder  number  indicates  the 
cylinder  in  MM  in  which  a record  containing  the  keyword  may  be  found,  and  the 
security  atom  name  indicates  the  security  atom  to  which  the  record  belongs.  The 
cluster  information,  on  the  other  hand,  is  not  stored  in  the  structure  memory 
but  is  mostly  stored  in  the  database  connand  and  control  processor  (DBCCP)  and 
the  security  filter  processor  (SFP).  The  keywords,  cylinder  numbers  and 
security  atom  names  are  stored  in  SM  in  a coded  fixed-length  form  in  order 
to  conserve  space  and  facilitate  searching  (it  is  easier  to  process  fixed- 
length  rather  than  variable-length  fields).  The  tables  in  IXU  are  used  to 
convert  the  coded  index  terms  in  SM  to  their  expanded  full-length  form. 

For  a detailed  explanation  of  the  coding  and  decoding  processes,  the  reader 
is  referred  to  [5], 
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3.1  Tables  for  Security  Enforcement 

There  are  three  tables  maintained  by  DBC  in  order  to  enforce  access 
control.  There  is  one  security  atom  definition  table  (SADT)  for  each  file 
which  is  needed  for  maintaining  the  definitions  of  security  atoms  in  terms 
of  security  conjunctions.  There  is  a security  information  table  (SIT) 
for  each  user-file  combination  known  to  DBC.  This  table  provides  a list 
of  security  conjunctions  and  corresponding  access  privileges  for  each  user 
of  a file  as  indicated  by  the  file  creator.  Finally,  an  atomic  access 
privilege  list  (AAPL)  is  created  by  DBC  for  each  user-file  combination  in 
order  to  keep  track  of  the  access  privileges  allowed  to  the  user  with  re- 
spect to  each  atom  of  the  file. 


A.  The  Security  Atom  Definition  Table  (SADT) 

The  security  atoms  of  a file  are  defined  in  a security  atom  definition 
table  (SADT).  Hence,  there  is  an  SADT  for  each  file.  A view  of  this  table 
is  shown  in  Figure  11.  There  is  an  entry  in  the  SADT  for  each  atom  of  a 
file  and  each  entry  has  two  fields.  The  first  field  is  a number  that  iden- 
tifies a security  atom  and  the  second  field  is  a list  of  security  conjunc- 
tion identifiers  that  define  the  security  atom.  Every  security  conjunction 
specified  by  a file  creator  is  assigned  an  identifier  by  DBC.  We  may  re- 
call, from  the  previous  section,  that  a security  atom  consists  of  records 
which  satisfy  the  same  unique  set  of  security  conjunctions.  These  security 
conjunctions  define  the  security  atom.  An  example  of  a SADT  would  be  the 
one  below. 

Security  Atom  Id  Security  Conjunction  Ids 

1 S1,S5 

2 S3,S4,S7 


Another  way  of  representing  the  same  information  is  as  shown  below. 
Security  Atom  Id  Security  Conjunction  Id 


1 

1 

2 

2 

2 


SI 

S5 

53 

54 
S7 


Either  representation  may  be  used  in  DBC. 


B.  The  Security  Information  Table  (SIT) 

There  is  one  such  table  for  every  user  of  a file.  Thus,  if  there 
are  n files  in  the  database  FI,  F2 , F3 Fn  and  the  file  Fi  is  access- 

ible by  Ui  users,  then  the  total  number  of  SITs  maintained  in  DBC  is  U1  + 

U2  + ...  + Un.  A logical  view  of  this  table  is  shown  in  Figure  12.  The 
first  field  consists  of  a security  conjunction.  The  second  field  consists 
of  a set  of  field-level  security  descriptors  of  the  form  (attribute  com- 
bination, access  descriptor).  The  access  descriptor  always  specifies  access- 
es which  are  not  allowed  on  the  corresponding  attribute  combination.  Thus, 
an  access  descriptor  specifies  a set  of  disallowed  accesses.  It  is  seen, 
for  example,  in  Figure  12  that  read  requests  on  both  the  Name  and  Position 
fields  of  records  satisfying  the  security  conjunction  (Salary  20,000)  A 
(Position3 PROFESSOR)  will  be  denied  by  DBC. 

This  table  may  actually  be  implemented  in  the  manner  indicated  in  Fi- 
gure 13.  The  implementation  uses  one  access  template  per  file.  This  access 
template  consists  of  a set  of  pairs  of  the  form  (combination  identifier: 
attribute  combination).  In  other  words,  each  attribute  combination  intend- 
ed for  protection  is  assigned  a combination  identifier.  The  second  field 
of  the  SIT  is  also  a set  of  pairs  of  the  form  (combination  identifier:  access 
descriptor).  Thus,  only  identifiers  and  not  the  actual  attribute  combina- 
tions themselves  need  to  be  stored  in  the  SITs.  Similarly,  the  first  field 
in  the  SIT  need  not  contain  a security  conjunction.  It  only  needs  to  con- 
tain a security  conjunction  identifier. 

C.  The  Atomic  Access  Privilege  List  (AAPL) 

There  is  one  such  list  for  every  user-file  pair.  A diagram  of  the 
AAPL  is  shown  in  Figure  14.  The  first  field  contains  a security  atom  iden- 
tifier, while  the  second  field  consists  of  a set  of  field-level  security 
descriptors.  This  field  is  similar  to  the  second  field  of  the  SIT. 

The  following  steps  are  taken  by  DBC  to  determine  if  access  ai  is 
permitted  on  attribute  combination  ACi  of  a record  in  security  atom  SA. 

(1)  Look  up  the  AAPL  for  the  entry  whose  first  field  contains  the  identi- 
fier corresponding  to  security  atom  SA.  Thus,  form  the  set  S of  all 
attribute  combinations  in  the  second  field  of  this  entry  that  are 
subsets  of  ACi. 

(2)  Take  the  union  of  the  access  descriptors  corresponding  to  every  member 
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of  S.  The  resultant  access'descriptor  AD  gives  the  list  of  disallowed 
accesses  on  ACi. 

(3)  Scan  AD  to  see  if  ai  is  disallowed.  If  not,  the  access  ai  is  permit- 
ted on  the  combination  ACi. 

Consider  that  we  are  given  the  AAPL  of  Figure  14.  We  are  required  to 
determine  if  the  'read'  access  is  allowed  on  the  combination  of  the  Name, 
Salary  and  Length-Of-Service  fields  of  records  in  Atom  1.  Scanning  through 
the  second  field  of  the  first  entry  in  the  AAPL,  we  determine  that  the  com- 
bination [Name,  Salary]  is  the  only  combination  which  is  a subset  of  the 
given  combination  [Name,  Salary,  Length-Of-Service].  The  access  descriptor 
corresponding  to  [Name,  Salary]  is  'No-Modify'.  Hence,  it  allows  'read' 
access.  Thus,  'read'  access  is  permitted  on  the  combination  of  the  Name, 
Salary  and  Length-Of-Service  fields  of  records  in  Atom  1. 

3.2  Tables  for  Data  Clustering 

For  the  purpose  of  data  clustering,  information  is  needed  in  DBC  about 
space  availability  in  the  mass  memory  and  about  the  definition  of  the  clus- 
ters. The  space  available  in  each  cylinder  allocated  to  a file  is  displayed 
in  a cylinder  space  table  (CST).  The  primary  and  secondary  clustering  de- 
scriptors are  defined  in  a primary  clustering  descrip  tor  table  (PCDT)  and  a 
secondary  clustering  descriptor  table  (SCDT),  respectively.  The  definition 
of  each  cluster  of  a file,  in  terms  of  primary  and  secondary  clustering  de- 
scriptors, is  carried  in  a cluster  identifier  definition  table  (CIDT). 

A.  The  Cylinder  Space  Table  (CST) 

A logical  view  of  this  table  is  shown  in  Figure  15.  There  is  one  such 
table  for  every  file  known  to  DBC.  There  are  as  many  entries  in  the  CST  as 
there  are  cylinders  allocated  to  the  file.  Each  entry  consists  of  two  fields 
The  first  field  contains  a cylinder  number  and  the  second  field  contains 
information  concerning  the  unused  space  in  that  cylinder. 

B.  The  Cluster  Identifier  Definition  Table  (CIDT) 

There  is  one  such  table  for  every  file  known  to  DBC.  Each  entry  in  the 
table  is  a quadruple  as  shown  in  Figure  16.  The  first  field  contains  a 
cluster  identifier.  The  second  and  third  fields  contain  the  identity  of  the 
primary  clustering  descriptor  and  the  secondary  clustering  descriptor,  respec 


Figure  16:  The  Cluster  Identifier  Definition  Table  (CIDT) 
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tively,  that  make  up  this  cluster.  The  fourth  field  contains  the  cylinder 
number  of  a cylinder  that  contains  at  least  one  record  belonging  to  this 
cluster. 

C.  The  Primary  Clustering  Descriptor  Table  (PCDT) 

The  logical  view  of  one  such  table  is  shown  in  Figure  17.  There  is 
one  primary  clustering  descriptor  table  (PCDT)  for  each  file  known  to  DBC. 
There  is  an  entry  in  the  PCDT  for  every  primary  clustering  descriptor  and 
each  entry  consists  of  five  fields  as  shown  in  the  figure.  The  first  field 
contains  the  identifier  of  the  attribute  in  the  primary  clustering  descrip- 
tor. This  identifier  is  assigned  by  a front-end  computer.  In  case  the  pri- 
mary clustering  descriptor  is  of  type-A,  then  the  second  and  third  fields 
contain  the  upper  and  lower  limits,  respectively,  of  the  range  that  defines 
the  descriptor.  If  the  primary  clustering  descriptor  or  sub-descriptor  is 
of  type-B  or  type-C,  then  the  second  field  is  null  (indicated  by  some  spe- 
cial symbol  such  as  a $ sign)  and  the  third  field  stores  the  value  of  the 
descriptor  or  sub-descriptor.  The  second  and  third  fields  always  contain 
values  in  their  original  form  rather  than  the  coded  form  used  to  store  them 
in  SM.  The  fourth  field  contains  the  identifier  assigned  to  the  descriptor 
by  DBC.  The  fifth  field  contains  the  maximum  space  requirement  for  the 
clustering  descriptor.  This  maximum  space  requirement  is  the  estimated 
number  of  bytes  needed  for  storing  all  records  whose  primary  clustering 
keywords  are  derivable  from  the  clustering  descriptor.  Naturally,  the  file 
creator  has  to  make  this  estimation. 

D.  The  Secondary  Clustering  Descriptor  Table  (SCDT) 

This  table  is  similar  to  the  PCDT.  However,  it  does  not  have  a fifth 
field  since  no  maximum  space  requirement  is  specified  for  secondary  cluster- 
ing descriptors.  Also,  there  is  now  one  entry  per  secondary  clustering 
descriptor  instead  of  one  entry  per  primary  clustering  descriptor. 


3. 3 Other  Table  Structures 
3.3.1  Tables  in  DBCCP 


concerning  some  of  the  other  tables. 


A.  The  File  Table  (FT) 

There  is  only  one  file  table  in  DBCCP  and  it  has  as  many  entries  as 
there  are  files  known  to  DBC.  The  FT  is  stored  in  some  global  location 
which  is  known  to  all  components  of  DBC.  Figure  18  provides  a diagram  of 
this  table.  Each  entry  of  the  FT  consists  of  two  fields.  The  first  is  a 
file  identifier  and  this  is  a numeric  encoding  of  the  file  name.  The  second 
field  is  a pointer  to  the  file  information  table  (FIT)  which  will  be  de- 
scribed in  the  following  sub-section. 

B.  The  File  Information  Table  (FIT) 

A logical  view  of  the  file  information  table  (FIT)  is  shown  in  Figure 
19.  There  is  one  such  table  for  every  file  known  to  DBC.  Logically,  it 
consists  of  16  separate  fields  defined  as  follows. 

1)  This  field  contains  the  identity  of  the  file  creator. 

2)  This  two  bit  field  indicates  the  status  of  the  file.  A file  may  be 
in  one  of  three  possible  states. 

a)  It  may  be  inactive  as  indicated  by  00  in  this  field. 

b)  It  may  be  open  for  creation  as  indicated  by  01  in  this  field. 

In  this  state,  the  file  is  being  created,  but  nobody  has 

begun  to  access  it  yet. 

c)  It  may  be  open  for  access  as  indicated  by  10  in  this  field. 

In  this  state,  the  file  is  being  accessed  by  some  user. 

3)  This  field  really  consists  of  n subfields,  where  n is  the  number  of 
attributes  defined  for  the  file  by  the  file  creator.  We  may  recall 
that  n is  specified  in  the  ' Open-database-f ile-f or-creation'  request 
(see  Figure  6).  The  n attribute  identifiers  assigned  to  this  file 
by  the  front-end  computer  are  stored  in  the  n subfields. 

4)  This  field  contains  the  number  of  mass  memory  cylinders  needed  to 
store  the  file  as  indicated  by  the  file  creator. 

5)  This  field  contains  the  number  of  additional  mass  memory  cylinders 
that  may  be  required  to  store  the  file  in  case  the  initial  allocation 
as  specified  in  Field  4 is  not  enough.  The  information  in  Fields  4 
and  5 come  directly  from  Arguments  2 and  3 of  the  'Open-database-file- 
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Id  of  Creator 
File  Status 

00  File  not  active 

01  File  open  for  creation 
10  File  open  for  access 

N Attribute  ids  of  the  file 

# OF  cylinders  allocated  to  the  file 

# OF  additional  cylinders  to  be  allocated 
Pointer  to  the  CAT  in  the  IXU 

Pointer  to  the  CBM  in  the  IXU 

Pointer  to  the  SANBM  in  the  IXU 

Pointer  to  the  CIBM  in  the  IXU 

P User  ids  of  users  allowed  to  access  this  file 

Pointer  to  the  SADT 

Pointer  to  the  PCDT 

Pointer  to  the  SCDT 

Pointer  to  the  CIDT 

Pointer  to  the  CST 

File  Identifier 


Figure  19:  The  File  Information  Table  (FIT) 


54  - 


for-creation'  request  (see  Figure  6). 

6)  This  field  contains  a pointer  to  the  Cylinder  Address  Table  (CAT) 
which  is  stored  in  the  secondary  memory  of  the  index  translation 
unit  (IXU)  and  which  is  described  in  the  last  sub-section.  The  CAT 
is  used  to  extract  absolute  cylinder  numbers  from  the  index  terms 
retrieved  from  the  structure  memory.  There  is  one  CAT  per  file. 

7)  This  field  contains  a pointer  to  the  Cylinder  Bit  Map  (CBM)  which  is 
also  stored  in  the  secondary  memory  of  the  IXU.  There  is  one  CBM 
(described  in  the  last  sub-section)  per  file.  The  CBM  is  used  to 
convert  absolute  cylinder  numbers  to  relative  cylinder  numbers  before 
storing  them  in  the  structure  memory  (SM). 

8)  This  field  contains  a pointer  to  the  Security  Atom  Name  Bit  Map 
(SANBM)  stored  in  the  secondary  memory  of  the  IXU.  There  is  one 
SANBM  (described  in  the  last  sub-section)  per  file.  The  SANBM  is 
used  to  allocate  and  deallocate  security  atom  identifiers. 

9)  This  field  contains  a pointer  to  the  Cluster  Identifier  Bit  Map  (CIBM) 
stored  in  the  secondary  memory  of  the  IXU.  There  is  one  CIBM  (des- 
cribed in  the  last  sub-section)  per  file  and  it  is  used  to  allocate 
and  deallocate  cluster  identifiers  for  that  file. 

10)  This  field  consists  of  p subfields,  where  p is  the  number  of  users 

that  are  allowed  to  access  the  file.  The  value  of  p is  extracted  from 
the  'Load-access-control- information'  (see  Figure  9)  request  described 
in  Section  2.  The  p user  identifiers  are  stored  in  the  p subfields. 


11) 
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C.  The  User  Table  (UT) 

There  is  only  one  user  table  in  DBC  and  it  contains  one  entry  for 
every  user  known  to  DBC.  The  UT  is  stored  in  some  global  location  which  is 


known  to  all  the  components  of  DBC.  A logical  view  of  the  UT  is  shown  in 
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Figure  20.  Each  entry  consists  of  two  fields.  The  first  field  contains 
the  identifier  of  the  user,  and  the  second  field  is  a pointer  to  his  user 
file  table  (UFT)  described  in  the  next  sub-section. 

D.  The  User  File  Table  (UFT) 

A logical  view  of  the  UFT  is  shown  in  Figure  21.  There  is  one  user 
file  table  for  each  user  known  to  DBC.  The  UFT  is  a table  of  varying  length 
and  consists  of  as  many  entries  as  there  are  files  the  user  is  allowed  to 
access.  Each  entry  contains  four  fields  as  described  below.  The  first 
field  contains  a file  identifier.  The  second  field  contains  a pointer  to 
the  SIT  for  this  user  corresponding  to  this  file.  The  third  field  contains 
a pointer  to  the  AAPL  for  this  user  corresponding  to  this  file.  A null 
entry  in  the  third  field  indicates  that  the  AAPL  for  this  user-file  com- 
bination has  not  yet  been  created  by  DBC.  The  fourth  field  is  a one  bit 
field.  A '1'  in  this  field  indicates  that  the  user  has  opened  this  file 
for  access. 

E.  The  Overall  Cylinder  Bit  Map  (OCBM) 

There  is  only  one  overall  cylinder  bit  map  in  DBC  and  it  is  stored 
in  a known  global  location.  The  OCBM  has  one  bit  corresponding  to  each 
cylinder  in  the  mass  memory.  If  the  cylinder  is  available  for  allocation, 
then  the  corresponding  bit  in  the  OCBM  contains  logic  0.  Otherwise,  the 
bit  in  the  OCBM  contains  logic  1.  DBC  uses  the  OCBM  to  allocate  cylinders 
to  files. 

3.3.2  Table  Structures  in  I XU 

Each  file  known  to  DBC  has  a cylinder  address  table  (CAT)  maintained 
by  IXU.  A CAT  has  up  to  256  entries,  which  is  the  upper  limit  on  the  num- 
ber of  cylinders  that  a file  may  occupy  (this  is  a decision  based  on  exper- 
ience with  file  sizes).  There  is  an  entry  in  a CAT  for  each  cylinder  allo- 
cated to  a file.  The  CAT  is  used  for  transforming  relative  cylinder  numbers 
(found  in  the  index  terms  stored  in  the  structure  memory)  into  absolute  cyl- 
inder numbers.  The  index  terms  contain  only  relative  cylinder  numbers.  A 
relative  cylinder  number  is  used  as  an  index  into  the  CAT  to  retrieve  the 
corresponding  absolute  cylinder  number  from  it.  Thus,  if  the  relative 
cylinder  number  is  3,  the  third  entry  in  the  CAT  contains  the  corresponding 


J 


ONE  BIT 


actual  cylinder  number. 

Corresponding  to  each  CAT  there  are  three  bit  maps,  the  Cylinder  Bit 
Map  (CBM),  the  Security  Atom  Name  Bit  Map  (SANBM) , and  the  Cluster  Identi- 
fier Bit  Map  (CIBM).  These  are  used  to  keep  track  of  the  allocation  and 
release  of  cylinders,  security  atom  names,  and  cluster  identifiers,  respec- 
tively. For  a detailed  description  of  the  CAT,  the  CBM,  the  SANBM  and  the 
CIBM,  the  reader  is  referred  to  [5], 
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4.  ALGORITHMS  FOR  SUPPORTING  ACCESS  CONTROL  AND  CLUSTERING 


In  this  section,  we  present  algorithms  that  are  built  in  the  hardware 
of  DBC  in  order  to  support  the  functions  of  access  control  and  clustering. 
Some  of  these  algorithms  are  executed  at  database  creation  time,  and  others 
are  executed  at  database  access  time.  It  is  our  belief  that  a better  un- 
derstanding of  these  algorithms  can  be  obtained  if  we  discuss  all  the  hard- 
ware algorithms  at  database  creation  time,  even  if  they  are  not  directly 
related  to  access  control  and  clustering.  Accordingly,  we  divide  this 
section  into  two  parts,  one  discussing  the  algorithms  for  file  creation,  and 
the  other  discussing  the  algorithms  for  file  access. 


4. 1 Algorithms  for  Database  Creation 

In  this  sub-section,  we  present  several  algorithms,  one  for  each  of  the 
preparatory  commands  mentioned  in  Section  2.  The  reader  is  advised  to  refer 
back  to  the  last  sub-section  of  Section  2,  in  order  to  look  at  the  nature  of 
these  requests  again.  The  algorithms  provide  a good  understanding  of  how 
the  various  tables  and  lists,  described  in  Section  3,  are  built  up. 

ALGORITHM  A:  To  process  the  command,  open-database-file-for-creat ion. 

Step  1:  Using  the  overall  cylinder  bit  map  (OCBM) , determine  if 
the  number  of  cylinders  n requested  for  initial  alloca- 
tion to  the  file  are  available.  That  is,  check  if  there 
are  at  least  as  many  bits  with  logic  0 in  the  OCBM  as 
there  are  cylinders  requested  for  initial  allocation. 

If  not,  send  a reject  signal  to  the  front-end  computer, 
and  go  to  Step  4. 

Step  2:  Allocate  a file  identifier  for  the  new  file.  Allocate 

space  in  the  database  command  and  control  processor  (DBCCP) 
for  a file  information  table  (FIT)  for  this  file,  and  make 
an  additional  entry  in  the  file  table  (FT).  Fields  1,  2, 

3,  4,  5 and  6 of  the  FIT  are  filled  up  at  this  time. 

Field  1 is  filled  with  the  identity  of  the  file  creator. 
Field  2 is  made  '01'  to  indicate  that  the  file  is  open 
for  creation.  Information  for  Fields  3,  4 and  5 is  ob- 
tained from  Arguments  1,  2,  and  3 of  this  request  (see 
Figure  6).  Field  16  is  filled  with  the  identifier  assigned 
to  the  file  by  the  front-end  computer.  This  information 
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is  also  present  in  the  request.  A cylinder  space  table 
(CST)  is  now  created  for  this  file.  At  this  stage,  the 
CST  indicates  that  all  n cylinders  of  the  file  are  empty. 
Field  15  of  the  FIT  is  now  filled  with  a pointer  to  the 
CST.  Now,  a cylinder  address  table  (CAT) , a cylinder  bit 
map  (CBM),  a cluster  information  bit  map  (CIBM) , and  a 
security  atom  name  bit  map  (SANBM)  are  created  for  this 
file  and  loaded  into  IXU.  The  CAT  contains  the  absolute 
cylinder  numbers  of  the  n cylinders  allocated  in  Step  1, 
in  its  first  n locations.  The  first  n bits  of  the  CBM 
are  each  filled  with  logic  1 to  indicate  that  n cylinders 
have  been  allocated  for  the  file.  Every  bit  of  the  CIBM 
and  the  SANBM  is  filled  with  logic  0 at  the  present  time. 

If  IXU  does  not  have  enough  space  to  hold  the  CAT,  the 
CIBM,  the  SANBM  and  the  CBM,  then  send  a reject  signal  to 
the  front-end  computer,  and  release  FIT,  FT  and  CST  space 
allocated  to  the  file.  Also,  release  the  cylinders  allo- 
cated to  the  file.  If  the  IXU  does  have  enough  space, 
then  fill  up  fields  6,  7,  8,  and  9 of  the  FIT.  A check 
is  made  to  see  if  the  user  table  (UT)  has  an  entry  corres- 
ponding to  the  file  creator.  If  so,  then  follow  the  pointer 
in  the  entry  to  the  user's  user  file  table  (UFT)  and  go  to 
Step  3.  If  the  UT  does  not  have  an  entry  corresponding  to 
this  user,  allocate  space  for  an  UFT  for  the  user,  and  then 
make  an  entry  in  the  UT  for  this  user.  Part  of  this  entry 
is  a pointer  to  the  newly  created  UFT. 

Step  3:  Make  a new  entry  in  the  UFT  for  this  file.  In  the  first 
field  of  the  UFT  entry,  enter  the  file  identifier  of  the 
file  being  created.  The  other  fields  are  left  empty  at 
this  stage. 

Step  4:  Release  space  occupied  by  the  command,  and  terminate. 


ALGORITHM  B:  To  process  the  command,  load-attribute- information. 

Step  1:  Check,  from  the  second  field  of  the  FIT,  if  the  file  is 

open  for  creation.  If  the  file  is  not  open  for  creation, 
send  a reject  signal  to  the  front-end  computer  and  go  to 
Step  5. 


- 60  - 


Step  2:  Load  the  hash  algorithms  of  the  file  into  the  KXU  secon- 
dary memory. 

Step  3:  For  each  attribute  of  the  file,  load  the  attribute  infor- 
mation into  the  KXU  secondary  memory. 

Step  4:  If  the  KXU  has  rejected  any  of  the  above  requests  because 
of  lack  of  space,  then  send  a reject  signal  to  the  front- 
end  computer  and  release  FIT,  FT,  UFT,  CST,  CAT,  CBM, 

SANBM,  CIBM,  and  UT  space  allocated  to  the  file  in  algo- 
rithm A.  Also,  release  the  cylinders  allocated  to  the  file. 

Step  5:  Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 


ALGORITHM  C: 
Step  1: 

Step  2: 


Step  3: 


Step  5: 


To  process  the  command,  load-descriptors. 

Check,  from  the  FIT,  if  the  file  is  open  for  creation.  If 
it  is  not  open  for  creation,  then  send  a reject  signal  to 
the  front-end  computer  and  go  to  Step  6. 

Allocate  space  for  a primary  clustering  descriptor  table 
(PCDT)  and  a secondary  clustering  descriptor  table  (SCDT). 

Fill  up  Fields  12  and  13  of  the  FIT  with  pointers  to  these 
two  tables,  respectively. 

If  the  descriptor  is  a primary  clustering  descriptor,  then 
an  entry  is  made  for  it  in  the  PCDT.  Otherwise,  if  the 
descriptor  is  a secondary  clustering  descriptor,  then  an 
entry  is  made  for  it  in  the  SCDT.  If  the  descriptor  is 
of  type-C,  then  Fields  2 and  3 of  its  entry  are  made  null 
(say,  represented  by  a $ symbol).  If  the  descriptor  is 
of  type-B,  then  Field  2 is  null,  and  Field  3 is  filled 
with  the  value  specified  in  the  descriptor.  If  the  descriptor 
is  of  type-A,  then  the  second  and  third  fields  are  filled 
with  the  lower  and  upper  limits,  respectively,  of  the  Tenge 
that  defines  the  descriptor.  If  the  descriptor  is  of  type-C, 
then  Field  4 is  left  empty  at  this  time.  Else,  allocate  a 
new  descriptor  identifier  and  place  it  in  Field  4 of  the  entry. 
Invoke  Algorithm C*  to  form  clustering  keyword  descriptors 
in  the  structure  memory  (SM).  We  may  recall  that  an  entry 
in  SM  consists  of  a keyword  followed  by  a set  of  index 
terms.  We  wish  to  modify  this  so  that  an  entry  consists  of 
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Step  6: 

ALGORITHM  C' : 

Step  1: 

Step  2: 

Step  3: 

Step  4: 

Step  3: 

ALGORITHM  I): 
Step  1: 

Step  2: 


a keyword  descriptor  followed  by  a set  of  index  terms.  A 
keyword  descriptor  is  a type-A  descriptor  and  consists  of 
an  attribute,  a lower  limit  and  an  upper  limit.  That  is, 
it  consists  of  an  attribute  and  a range  of  values.  To 
find  information  regarding  any  keyword  Ki  in  SM,  it  is 
only  necessary  to  search  the  entry  corresponding  to  key- 
word descriptor  KDi,  where  Ki  is  derivable  from  KDi.  The 
advantage  of  using  keyword  descriptors  is  that  less  space 
is  needed  in  SM  than  if  keywords  were  used. 

Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 

To  form  clustering  keyword  descriptors  in  the  keyword 
descriptor  table  (iUT;  . 

Using  the  file  identifier  passed  as  argument  to  this 
algorithm,  search  the  FT  and  locate  the  entry  for  this 
file  in  it.  Follow  the  pointer  in  the  entry  to  the  FIT. 
Follow  the  pointer  in  Field  12  of  the  FIT  to  the  PCDT  for 
the  file.  For  each  type-A  descriptor  in  the  PCDT  (as  indi- 
cated by  non-null  values  in  Fields  2 and  3 of  the  entry), 
do  Step  3. 

Form  a keyword  descriptor  as  shown  in  Figure  22.  Store 
this  descriptor  in  a temporary  table  called  the  keyword 
descriptor  tab le  ( KDT ) . 

Follow  the  pointer  in  Field  13  of  the  FIT  to  the  SGDT 
for  this  file.  For  each  type-A  descriptor  in  the  SCOT, 
do  Step  3. 

Te  rminate. 

To  process  the  command,  load-access-control- information. 
Check,  from  the  FIT,  if  the  file  is  open  for  creation.  If 
it  is  not  open  for  creation,  then  send  a reject  signal  to 
the  front-end  computer  and  go  to  Step  7. 

Space  is  allocated  in  DBCCP  for  p SITs,  where  p is  the 
number  of  users  allowed  to  access  the  file.  The  value  of 
p is  part  ol  tnis  command  (see  Figure  9a-  Each  SIT  will 
have  k entries,  where  k is  the  number  of  security  conjunc- 
tions specified  by  the  file  creator.  All  the  p SITs  may 
now  be  filled  up  completely,  using  the  information  provided 
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by  the  user  in  the  request.  (See  Figure  9a.) 

Step  3:  In  the  p r.ubfields  of  Field  10  of  the  FIT,  store  the  p user 
identifiers,  of  the  p users  allowed  to  access  the  file. 

The  p user  identifiers  are  part  of  the  information  provid- 
ed by  the  user  with  this  request. 

Step  4:  For  each  of  the  p users,  do  Step  5. 

Step  5:  Check  if  he  already  has  an  entry  in  the  UT.  If  not,  create 
an  entry  for  him  in  the  UT,  and  also  allocate  space  for  an 
UFT  for  him.  Now  make  an  additional  entry  in  the  UFT. 

The  first  field  of  this  entry  contains  the  file  identifier. 
The  second  field  of  this  entry  contains  a pointer  to  the 
SIT  just  formed  for  this  user.  The  third  field  is  null, 
indicating  that  the  AAPL  for  this  user-file  combination 
has  not  yet  been  created.  The  fourth  field  i‘  made  '0'. 

Step  6:  Invoke  Algorithm  D'  to  form  keyword  descriptors 
in  SM. 

Step  7:  Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 

ALGORITHM  D' : To  form  keyword  descriptors  in  the  structure 
memory  (SM). 

Step  1:  Find  the  entry  for  the  user  in  the  UT.  Follow  the  pointer 
in  the  UT  to  his  UFT.  Search  the  UFT  for  the  entry  corres- 
ponding to  this  file.  Thus,  locate  the  SIT  for  this  user- 
file  combination. 

Step  2:  Look  at  the  first  field  of  every  entry  in  the  SIT.  This 
field  contains  security  conjunctions  which  are  made  up  of 
security  predicates.  We  recall  that  each  predicate  is  a 
triple  of  the  form  (attribute,  relational  operator,  value). 
For  the  purposes  of  this  algorithm,  we  shall  only  concentrate 
on  the  attribute  and  value  parts  of  a security  predicate. 
Also,  we  shall  restrict  ourselves  to  those  security  predi- 
cates with  numeric  value  parts.  For  every  unique  attribute 
present  in  these  security  predicates  and  in  Field  1 of  the 
KDT  formed  in  Algorithm  C' , do  Step  3. 
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Step  3:  Make  an  entry  for  this  attribute  in  a temporary  table  called 
the  descriptor  formation  table  (DFT).  Each  entry  in  this 
table  consists  of  two  fields.  The  first  field  contains  the 
attribute  identifier.  The  second  field  contains  a list  of 
values.  These  values  are  those  associated  with  this  attri- 
bute in  the  various  security  predicates  of  the  file  and  in 
the  entries  in  the  KDT. 

Step  4:  For  each  entry  in  the  DFT,  do  the  following.  Remove  all 

duplicates  that  are  present  among  the  values  in  the  second 
field  of  this  entry.  Now,  sort  the  remaining  values  in  the 
second  field  of  this  entry  into  ascending  order. 

Step  5:  Let  each  entry  be  of  the  form: 

Attribute  ID  VI,  V2,  ...,  Vn. 

For  each  entry,  form  n+1  security  keyword  descriptors,  where 
each  descriptor  is  of  the  form  shown  in  Figure  22. 

The  first  descriptor  is: 

Attribute  ID  -<■  VI 

The  second  descriptor  is: 

Attribute  ID  VI  V2 

The  third  descriptor  is: 

Attribute  ID  V2  V3 

The  n+lth  descriptor  is: 

Attribute  ID  Vn  “ 

Enter  these  descriptors  as  new  entries  in  SM.  The  index 
terms  for  these  keyword  descriptors  will  be  formed  as  and 
when  records  containing  keywords  derivable  from  these  de- 
scriptors are  placed  in  the  mass  memory  (MM)  of  DBC. 

Step  6:  Delete  the  DFT  and  KDT  and  terminate. 

We  illustrate  this  algorithm  in  Figures  23,  24,  25  and  26.  In  Figure  23, 
we  show  the  four  security  conjunctions  specified  by  a file  creator.  Figure  24 
shows  the  descriptor  formation  table  (DFT).  Note  that  the  attribute  Position 
has  no  entry  in  the  DFT,  since  the  values  associated  with  it  are  non-numeric. 

In  Figure  25,  we  show  the  same  table  after  removal  of  duplicates  from  the  val- 
ues in  all  second  fields  and  after  sorting  the  values  in  all  second  fields  into 
ascending  order.  Finally,  in  Figure  26,  we  show  the  security  keyword  descrip- 
tors formed  as  a result  of  the  security  conjunctions  specified  by  the  user. 
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1,  (Salary  > 10,000)  a (Weight  < 150) 

2,  (Position  = PROFESSOR)  a (Salary  < 2,000) 

3,  (Weight  = 130)  a (Salary  = 12,000) 

4,  (Salary  = 10,000) 

Figure  23:  Security  conjunctions  of  a file, 


Attribute  ID 

List  of  associated  values 

Salary 

Weight 

10000,  2000,  12000,  10000 

150,  130 

Figure  24:  A view  of  the  descriptor 

FORMATION  TABLE  (DR), 


Attribute  ID 

List  of  associated  values 

Salary 

Weight 

2000,  10000,  12000 

130,  150 

Figure  25:  A view  of  the  DR,  after  elimi 

NATION  OF  DUPLICATES  AND 
SORTING. 
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ALGORITHM  E:  To  process  the  command,  load-record. 

Step  1:  Check,  from  the  FIT,  if  the  file  is  open  for  creation.  If 
the  file  is  not  open  for  creation,  then  send  a reject  sig- 
nal to  the  front-end  computer  and  go  to  Step  8. 

Step  2:  Invoke  Algorithm  F to  determine  the  security  atom  SA  to 

which  the  record  belongs.  If  the  algorithm  returns  a neg- 
ative result,  then  send  a reject  signal  to  the  front-end 
computer  and  go  to  Step  8. 

Step  3:  Invoke  Algorithm  G1  to  determine  the  cluster  to  which  the 

record  belongs.  If  the  algorithm  returns  a negative  result, 
then  send  a reject  signal  to  the  front-end  computer  and  go 
to  Step  8. 

Step  4:  Invoke  Algorithm  G2  to  determine  the  cylinder,  in  mass  mem- 
ory, in  which  the  record  is  to  be  placed.  If  the  algorithm 
returns  a negative  result,  then  send  a reject  signal  to  the 
front-end  computer  and  go  to  Step  8. 

Step  5:  Insert  the  record  into  the  cylinder  selected  in  Step  4. 

Step  6:  For  each  directory  keyword  in  the  record,  do  Step  7.  We 
may  recall  that  part  of  the  information  provided  in  the 
keyword  indicates  if  it  is  a directory  keyword  or  not. 

Step  7:  Transform  the  keyword  using  the  KXU.  Using  the  CBM  stored 
in  the  IXU,  convert  the  absolute  cylinder  number  obtained 
in  Step  4 into  a relative  cylinder  number.  Store  the  ab- 
solute cylinder  number  in  the  CAT.  Make  a new  index  term, 
where  the  index  term  is  defined  as  given  below.  The  first 
field  contains  the  relative  cylinder  number,  and  the  second 
field  contains  the  security  atom  identifier  SA  obtained  from 
Step  2.  Search  SM  to  see  if  the  transformed  keyword  is  deri- 
vable from  any  of  the  keyword  descriptors  or  is  one  of  the 
keywords  in  the  entries  of  SM.  If  so,  then  enter  the  index 
term  under  this  keyword  or  keyword  descriptor.  If  not,  then 
enter  the  index  term  under  this  new  transformed  keyword  in  SM. 

Step  8:  Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 


ALGORITHM  F:  To  determine  the  security  atom  number  for  a record  for  in- 
sertion. 
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Step  1: 
Step  2: 

Step  3: 

Step  4: 
Step  5: 

Step  6: 

Step  7: 

Step  8: 


Let  the  record  for  insertion  have  k keywords.  Set  i=l, 
p=l,  j=l,  n-1. 

Look  at  the  SIT  for  this  user-file  combination.  Let  there 
be  s entries  in  the  SIT,  indicating  s security  conjunctions. 
Let  the  ith  security  conjunction  have  Pi  predicates.  Re- 
call that  a security  conjunction  is  a conjunction  of  secur- 
ity predicates. 

Look  at  the  first  field  of  the  ith  entry  in  the  SIT.  This 
field  contains  the  ith  security  conjunction.  Look  at  the 
pth  predicate  of  this  security  conjunction  and  see  if  the 
jth  keyword  of  the  record  satisfies  this  predicate.  If  it 
does,  then  go  to  Step  5.  Else,  go  to  Step  4. 
j=j+l.  If  j>k,  then  go  to  Step  6.  Else,  go  to  Step  3. 
p=p+l.  If  p>Pi,  then  go  to  Step  7.  Else,  set  j=l.  Go  to 
Step  3. 

[The  record  does  not  satisfy  the  ith.  security  conjunction.]. 
i=i+l.  If  i>s,  then  go  to  Step  8.  Else,  set  j=l,  set  p=l, 
go  to  Step  3. 

All  the  predicates  of  the  ith  security  conjunction  have  been 
satisfied.  LIST  [n]=i.  n=n+l.  i=i+l.  If  i>s,  then  go  to 
Step  8.  Else,  set  j=l,  set  p=l,  go  to  Step  3. 

[LIST[1]  through  LIST  [n-1]  contain  the  set  of  security 
conjunction  identifiers  that  describe  the  security  atom  to 
which  the  record  to  be  inserted  belongs.  If  n=l,  then  it 
signifies  that  no  security  conjunction  defines  the  atom  to 
which  the  record  belongs,  and  this  is  security  atom  0. ] If  a 
security  atom  definition  table  (SADT)  already  exists  for  this 
file,  then  search  it,  looking  for  an  entry  that  contains 
LIST[1]  through  LIST  [n-1]  in  Field  2.  If  a matching  entry 
is  found,  then  return  the  atom  identifier  in  Field  1 of  the 
matching  entry  and  terminate.  If  no  such  entry  exists,  or 
if  the  SADT  itself  does  not  exist,  allocate  a new  security 
atom  name  for  the  file  using  the  SANBM  in  the  IXU.  If  a 
new  identifier  cannot  be  allocated,  return  a negative  sig- 
nal and  terminate.  Else,  make  a new  entry  in  the  SADT. 

The  two  fields  of  this  entry  are  defined  below.  Field  1 
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will  contain  the  atom  identifier  just  assigned,  and  Field  2 
will  contain  LIST  [1]  through  LIST  [n-1].  If  the  SADT  has 
Just  been  allocated,  then  enter  a pointer  to  the  SADT  in 
Field  11  of  the  SIT. 


ALGORITHM  Gl: 


Step  1: 


Step  2: 


Step  3: 


Step  4: 
Step  5: 


To  determine  the  cluster  which  should  contain  a record 
for  insertion. 

Let  the  record  for  insertion  have  k keywords.  Let  the 
attribute  identifier  of  the  ith  keyword  be  called  Ai  and 
the  value  of  the  ith  keyword  be  Vi.  Set  i=l,  FLAG=0, 
PRIORITY=0,  LIST[1]=0,  LIST[2]=0,  FLAG1=0 , FLAG2=0, 

MAX=0.  [FLAG  is  used  to  indicate  whether  the  primary 
clustering  keyword  for  the  record  has  been  found.  FLAG1 
indicates  when  a new  cluster  identifier  has  to  be  assigned 
to  the  file.  LIST[1]  and  LIST[2]  will  contain,  eventually, 
the  primary  clustering  descriptor  identifier  and  the  secondary 
clustering  descriptor  identifier  from  which  the  primary 
clustering  keyword  of  the  record  and  the  most  important 
secondary  clustering  keyword  of  the  record  are  derived.] 

Search  the  first  field  of  the  PCDT  for  this  file,  looking 
for  a match  with  Ai.  Call  the  set  of  entries  in  the  PCDT 
with  a matching  attribute  PC.  Let  there  be  n elements  in 
PC.  If  n is  zero,  then  go  to  Step  3.  Else,  go  to  Step  4. 
[Here,  we  are  checking  to  see  if  the  keyword  is  the  primary 
clustering  keyword. ] 

Search  the  first  field  of  the  SCDT  for  this  file,  looking 
for  a match  with  Ai.  Call  the  set  of  entries  in  the  SCDT 
with  a matching  attribute  SC.  Let  there  be  m elements  in 
SC.  If  m is  zero,  then  go  to  Step  18.  Else,  go  to  Step  11. 
[Here,  we  are  checking  to  see  if  the  keyword  is  a secondary 
clustering  keyword. ] 

j=l.  [In  Step  5,  we  examine  the  jth  element  of  PC.] 

Examine  the  second  field  of  the  jth  matching  entry  in  the 
PCDT.  If  the  second  field  is  null,  and  the  third  field  is 
not,  then  compare  Vi  with  the  third  field.  If  neither  the 
second  nor  the  third  field  is  null,  then  compare  Vi  with 
the  second  and  third  fields,  and  check  to  see  if  Vi  lies 
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Step  6: 
Step  7: 

Step  8: 

Step  9: 
Step  10: 


Step  11: 

Step  12: 
Step  13: 


in  between  the  values  specified  in  the  second  and  third 
fields.  If  the  comparison  is  successful,  then  go  to  Step 
7.  If  the  comparison  is  unsuccessful,  then  go  to  Step  6. 

If  both  the  second  and  third  fields  are  null,  then  set  FLAG2 
=1.  Set  MAX=value  in  fifth  field,  and  go  to  Step  6. 

[Check  the  next  element  of  PC].  j=j+l.  If  j£n,  then  go  to 
Step  5.  Else,  go  to  Step  8. 

Store  the  fourth  field  of  the  matching  entry  in  LIST[1], 

FLAG2=0.  MAX=0.  FLAG=1.  Go  to  Step  9.  [Primary  clustering 
descriptor  describing  the  primary  clustering  keyword  of  the 
record  has  now  been  found.] 

If  FLAG2=0,  then  go  to  Step  9.  Else,  go  to  Step  10.  [Here, 
we  are  checking  to  see  if  there  is  a type-C  descriptor  with 
this  attribute.  If  there  is,  we  need  to  allocate  a r.cw  type- 
C sub-descriptor.  ] 

i=i+l.  If  i>k,  then  go  to  Step  18.  Else,  if  FLAG=0,  then  go 
to  Step  2.  If  i<k  and  FLAG=1,  go  to  Step  3. 

Allocate  a new  primary  clustering  descriptor  identifier  for 
the  file.  Make  a new  entry  in  the  PCDT  for  this  file,  where 
the  four  fields  of  this  new  entry  are  described  below. 

Field  1 will  contain  Ai.  Field  2 will  be  null.  Field  3 will 
contain  Vi,  and  Field  4 will  contain  the  new  primary  clus- 
tering descriptor  identifier  just  allocated  for  the  file.  Field 
5 will  contain  MAX.  This  new  identifier  will  also  be  stored 
in  LIST[1] . FLAG2-=0.  MAX=0.  FLAG-1.  Go  to  Step  9.  [Primary 

clustering  descriptor  describing  the  primary  clustering  key- 
word has  been  found.] 

Compare  PRIORITY  with  the  relative  importance  of  the  attri- 
bute Ai  as  stored  in  the  KXU.  We  may  recall  that  the  rela- 
tive importance  of  a secondary  clustering  attribute  is  part 
of  the  information  contained  in  a 'Load-attribute- informat  ion* 
request  (see  Figure  7).  If  PRIORITY  is  less  than  the  rela- 
tive importance  of  Ai,  then  go  to  Step  12.  Else,  go  to  Step  9. 
j=l.  [In  Step  13,  we  examine  the  jth  matching  entry  in  the 
SCDT. ] 

Examine  the  second  field  of  the  jth  element  of  SC.  If  the 
second  field  is  null,  and  the  third  field  is  not,  then 
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Step  14: 
Step  15: 


Step  16: 


Step  17: 


Step  18: 


compare  Vi  with  the  third  field.  If  neither  the  second 
nor  the  third  field  is  null,  then  compare  Vi  with  the  se- 
cond and  third  fields,  and  check  to  see  if  Vi  lies  in 
between  the  values  specified  in  the  second  and  third  fields. 

If  the  comparison  is  successful,  then  go  to  Step  15.  If 
the  comparison  is  unsuccessful,  then  go  to  Step  14.  If 
both  the  second  and  third  fields  are  null,  then  set  FLAG2 
=1  and  go  to  Step  14. 

j=j+l.  If  j<m,  then  go  to  Step  13.  Else,  go  to  Step  16. 

Store  the  fourth  field  of  the  matching  entry  in  LIST[2], 

FLAG2=0.  PRIORITY=Relative  importance  of  A as  found  from 
KXU.  Go  to  Step  9.  [A  secondary  clustering  descriptor 
describing  a secondary  clustering  keyword  has  been  found.] 

Go  to  Step  9.  [A  secondary  clustering  descriptor  describing 
a secondary  clustering  keyword  has  been  found. ] 

If  FLAG2  = 0,  then  go  to  Step  9.  Else,  go  to  Step  17. 

[Here,  we  check  to  see  if  a type-C  descriptor  with  attri- 
bute Ai  is  present.  If  it  is,  we  need  to  allocate  a new 
type-C  sub-descriptor. ] 

Allocate  a new  secondary  clustering  descriptor  identifier 
for  the  file.  Make  a new  entry  in  the  SCDT  for  this  file, 
where  the  four  fields  of  this  new  entry  are  described  below. 

Field  1 will  contain  Ai.  Field  2 will  be  null.  Field  3 
will  contain  Vi,  and  Field  4 will  contain  the  new  secondary 
clustering  descriptor  identifier  just  allocated  for  the 
file.  The  new  secondary  clustering  descriptor  identifier 
will  also  be  stored  in  LIST[2].  FLAG2=0.  PRIORITY=Relative 
importance  of  Ai  as  found  from  KXb . Go  to  Step  9. 

[LIST[1]  and  LIST [2]  contain  the  primary  clustering  descrip- 
tor and  the  secondary  clustering  descriptor,  respectively, 
that  describe  the  cluster  to  which  the  record  to  be  insert- 
ed belongs.  If  FLAG=0,  then  it  indicates  that  no  primary 
clustering  descriptor  that  may  be  used  for  clustering  has  been 
found.  In  this  case,  we  return  a negative  signal  and  terminate.] 
If  a cluster  identifier  definition  table  (CIDT)  exists, 
then,  search  the  CIDT  of  this  file  to  determine  if  an 
entry  for  a cluster  which  may  be  described  by  the  primary 
clustering  descriptor  identifier  in  LIST[1]  and  the  secon- 
dary clustering  descriptor  identifier  in  LIST[2]  exists. 


If  such  an  entry  does  exist,  then  set  CLUSTER  to  be  equal 
to  Field  1 of  this  entry,  and  terminate.  Otherwise,  set 
FLAGl=l,  and  allocate  a new  cluster  identifier  for  the 
file  using  the  CIBM  described  in  the  previous  section. 

If  a new  cluster  identifier  cannot  be  allocated,  return 
a negative  signal  and  terminate.  Else,  set  CLUSTER  to  be 
equal  to  the  new  identifier  just  allocated. 

Step  19:  Return  CLUSTER  and  terminate. 

ALGORITHM  G2 : To  select  a cylinder  into  which  a given  record  may  be  in- 
serted. 

Step  01:  If  FLAGl=l,  go  to  Step  10.  Else,  go  to  Step  2. 

Step  02:  Search  the  CIDT,  looking  for  all  entries  whose  second 

field  contains  LIST[1]  and  whose  third  field  contains  LIST[2 
Extract  the  fourth  field  from  all  such  entries  and  put  them 
in  a set  w.  For  every  unique  member  of  w,  do  Step  i. 

Step  03:  Search  the  CST  of  this  file,  looking  for  an  entry  whose 

first  field  matches  the  element  in  w.  Place  the  matching 
CST  entry  (which  is  a pair)  into  a set  h. 

Step  04:  Every  element  of  h is  a pair.  Find  that  element  of  h whose 
second  part  is  the  largest  of  all  elements  in  h.  If  more 
than  one  such  element  exists,  then  choose  one  out  of  these 
arbitrarily.  Let  this  be  the  element  (a,  b). 

Step  05:  Compare  b with  the  length  of  the  record  (lr)  to  be  inserted. 

We  may  recall,  that  the  length  of  the  record  is  part  of  the 
information  contained  in  the  'Load-record'  request  (see 
Figure  10).  If  b>_lr,  then  change  the  CST  entry  from 
(a,  b)  to  (a,  b-lr),  set  CYLINDER  = a,  and  go  to  Step  12. 
Else,  go  to  Step  6. 

Step  06:  Search  the  CIDT,  looking  for  all  entries  whose  second  field 
contains  LIST[1].  Extract  the  fourth  field  from  all  such 
entries  and  put  them  together  in  a set  w'.  Compare  the 
number  of  unique  elements  in  w'  (:w':),  with  the  estimated 
number  of  cylinders  br  needed  for  storing  all  records  whose 
primary  clustering  keyword  is  derivable  from  the  primary 
clustering  descriptor  in  LISTfl],  This  number  may  be 


Step  04: 
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found  from  Che  fifth  field  of  the  entry  in  the  PCDT,  whose 
fourth  field  matches  LIST[1].  If  :w' : < br,  then  go  to 
Step  10.  Else,  for  every  unique  member  of  w' , do  Step  7. 

Step  07:  Search  the  CST  of  this  file,  looking  for  an  entry  whose 

first  field  matches  the  element  in  w'.  Place  the  matching 
CST  entry  (which  is  a pair)  into  a set  h' . 

Step  08:  Find  that  element  of  h' , whose  second  part  is  the  largest 
of  all  elements  in  h'.  If  more  than  one  such  element 
exists,  then  choose  one  of  them  arbitrarily.  Let  this  be 
the  element  (a',  b'). 

Step  09:  Compare  b'  with  lr.  If  b'>lr,  then  change  the  CST  entry 

from  (a',  b')  to  (a1,  b'-lr),  set  CYLINDER  = a',  and  go  to 
Step  11.  Else,  go  to  Step  10. 

Step  10:  Search  the  CST  of  the  file,  looking  for  the  entry  with  the 
largest  number  of  bytes  in  the  second  field.  If  more  than 
one  such  entry  exists,  then  choose  one  of  them  arbitrarily. 

Let  this  be  the  entry  (a",  b").  Compare  b"  with  lr.  If 
b">lr,  then  change  the  CST  entry  from  (a",  b")  to  (a",  b"-lr) , 
set  CYLINDER  = a",  and  go  to  Step  11.  Else,  terminate  with 
a negative  signal. 

Step  11:  Make  a new  entry  in  the  CIDT,  where  the  four  fields  of 

this  entry  are  defined  as  follows.  Field  1 will  contain 
the  value  in  CLUSTER.  Field  2 will  contain  the  primary 
clustering  descriptor  identifier  in  LIST[1],  Field  3 will 
contain  the  secondary  clustering  descriptor  identifier  in 
LIST [ 2 ] . Field  4 will  contain  the  value  in  CYLINDER. 

Step  12:  Return  CYLINDER  and  terminate.  [Note:  This  algorithm  makes 
use  of  CLUSTER  and  FLAG1  from  the  previous  algorithm.  ] 

ALGORITHM  H:  To  process  the  command,  close-database-file. 

Step  01:  Check  to  see  if  the  file  is  open,  whether  for  access  or 

for  creation.  If  it  is  already  inactive,  then  send  a reject 
signal  to  the  front-end  computer  and  go  to  Step  3. 

Step  02:  If  the  file  is  open  for  access,  then  look  at  the  entry  for 


this  file  in  the  user's  UFT  and  change  the  fourth  field  to 
logic  '0'.  Also,  make  the  necessary  changes  in  the  second 
field  of  the  FIT  to  indicate  that  the  file  is  inactive. 


Step  03:  Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 

4.2  Algorithms  for  database  access 

This  sub-section  deals  with  the  two  most  common  kinds  of  requests  that 
are  made  upon  any  database.  These  are  the  'Insert-record'  and  the  'retrieve- 
by-query'  requests.  The  algorithm  executed  upon  receipt  of  a 'delete-file' 
request  is  also  included  in  order  to  emphasize  that  this  command  is  different 
from  the  'close-database-file'  command. 


ALGORITHM 

Step 


I: 

1: 


Step  2 : 


Step  3: 


Step  4: 


Step  5: 
Step  6: 

ALGORITHM  J: 

Step  1: 


To  process  the  command,  open-database-file-for-access. 
Check  if  the  file  is  open  for  creation.  If  so,  then  send 
a reject  signal  to  the  front-end  computer  and  go  to 
Step  6. 

Check  to  see  if  there  is  an  entry  for  the  user  in  the  UT. 
If  not,  then  send  a reject  signal  to  the  front-end  com- 
puter, and  go  to  Step  6.  If  he  does  have  an  entry,  then 

follow  the  pointer  in  the  UT  entry  to  his  UFT. 

Search  the  UFT,  looking  for  the  entry  corresponding  to 

this  file.  The  fourth  field  of  the  UFT  is  made  '1'  to 

indicate  that  the  user  has  opened  this  file  for  access. 

If  the  third  field  of  this  entry  is  null,  then  go  to  Step 
4.  Else,  go  to  Step  5. 

For  each  security  atom  found  in  the  SADT,  invoke  Algorithm 
J to  determine  the  access  privileges  accorded  to  the  se- 
curity atom.  Thus,  form  the  AAPL  for  this  user-file  com- 
bination. Change  Field  3 of  the  UFT  entry  to  point  to 
the  AAPL  just  formed. 

Change  Field  2 of  the  FIT  to  indicate  that  the  file  is 
open  for  access. 

Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 

To  determine  the  access  privileges  accorded  to  a security 
atom. 

Search  the  SADT,  looking  for  an  entry  whose  first  field 
contains  the  given  security  atom  identifier  sa.  Form  the 
set  SD,  consisting  of  all  security  conjunction  identifiers 
which  define  the  argument  security  atom  and  which  are 
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stored  in  Field  2 of  the  matching  entry.  Also,  form  a 
field-level  security  descriptor  which  allows  all  accesses 
on  all  attribute  combinations  and  store  this  in  AD. 

Step  2:  If  SD  is  non-empty,  then,  for  every  element  sd  in  SD,  do 

Step  3.  Else,  set  AD=AD  A Default  access  descriptor  and  go 
to  Step  4.  We  recall  that  the  default  access  descriptor  is 
part  of  the  'Load-access-control-information'  command  (see 
Figure  9a) . 

Step  3:  Going  through  the  SIT  of  this  user-file  combination,  find 
that  entry  that  has  sd  in  the  first  field.  Let  Ai  be  the 
corresponding  field-level  security  descriptors  in  the  se- 
cond field  of  this  entry.  Set  AD=AD  A Ai.  [Notice  that 
both  AD  and  Ai  are  sets  of  field-level  security  descriptors. 
We  are  required  to  intersect  these  two  sets  and  ccme  up 
with  a new  set  of  security  descriptors.  We  proceed  in  the 
following  manner.  We  form  the  set  S consisting  of  every 
field-level  security  descriptor  that  appears  in  either  of 
the  two  sets.  Now,  for  each  field-level  security  descriptor 
Di  in  S,  we  do  the  following.  Recall  that  a field-level 
security  descriptor  consists  of  an  attribute  combination 
part  ACi  and  an  access  descriptor  part  ADi.  Going  through 
the  remaining  field-level  security  descriptors  in  S we  come 
up  with  the  set  P of  field-level  security  descriptors  each 
of  whose  attribute  combination  part  is  a subset  of  ACi. 

Now,  we  look  through  the  access  descriptor  part  of  every 
member  of  P to  see  if  there  is  at  least  one  element  in  P 
which  has  an  access  descriptor  part  that  disallows  all  ac- 
cesses disallowed  by  ADi.  [Recall  that  an  access  descriptor 
is  a list  of  disallowed  accesses.].  If  at  least  one  such 
element  exists,  then  delete  Di  from  S.  After  this  process 
has  been  repeated  for  every  field-level  security  descriptor 
in  S,  the  remaining  set  S is  the  required  intersected  set. ] 

Step  4:  Make  a new  entry  in  the  AAPL.  The  two  fields  of  this  new 

entry  are  defined  below.  Field  1 will  contain  the  security 
atom  identifier  sa.  Field  2 will  contain  the  set  of  field- 
level  security  descriptors  AD.  Terminate. 

ALGORITHM  K:  To  process  the  delete-file  command. 

Step  1:  Gheck  the  FIT  to  see  if  the  file  is  open  for  access.  If  so. 
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Step  2 : 

Step  3: 


Step  4: 
Step  5: 
Step  6: 

Step  7: 

Step  8: 
Step  9: 


then  send  a reject  signal  to  the  front-end  computer  and 
to  Step  9.  [A  delete-file  command  may  be  issued  only  f< 
lowing  a close-database-file  request. ] 

Check  to  see  if  an  entry  exists  for  the  user  in  the  UT. 
not,  then  send  a reject  signal  to  the  front-end  computei 
and  go  to  Step  9.  If  he  does  have  an  entry,  then  follov 
the  pointer  in  the  UT  entry  to  his  UFT. 

Search  the  UFT,  looking  for  the  entry  corresponding  to  t 
file.  Using  this  entry,  search  the  AAPL  for  this  user-f 
combination  to  determine  if  the  delete-file  access  is  al 
[Note:  accesses  such  as  delete-file,  which  are  applicab 
the  file  level,  and  not  at  the  security  atom  level,  may 
checked  for  by  looking  at  the  access  privileges  correspo 
ing  to  any  one  security  atom  and  any  one  attribute  combi 
tion.  The  delete-file  access  will  either  be  uniform1}' 
allowed,  or  uniformly  declined  for  every  (security  atom, 
attribute  combination)  pair.]  If  it  is  not  allowed,  the 
send  a reject  signal  to  the  front-end  computer,  and  go  t 
Step  9.  If  the  delete-file  privilege  is  allowed,  then  d< 
the  AAPL  and  the  entry  in  the  UFT.  If  it  happens  to  be 
only  entry  in  the  UFT,  then  delete  the  UT  of  this  user. 
Delete  attribute  information  in  the  IXU  for  each  attribu 
identifier  allocated  to  the  file  as  indicated  in  the  FIT 
Release  all  cylinders  and  all  primary  clustering  and  sec 
dary  clustering  identifiers  allocated  to  the  file. 
Release  space  occupied  by  the  SADT,  all  the  SIT's  of  thi: 
fil«  , all  the  AAPL's  of  this  file,  the  CST.  the  CIDT,  thi 
PCDT , the  SCDT,  the  CAT,  the  CBM,  the  CIBM,  the  SANBM,  ai 
the  only  entry  in  the  FT.  For  every  user  allowed  to  ace 
the  file  as  indicated  in  Field  10  of  the  FIT,  do  Step  7. 
Delete  the  entry  corresponding  to  this  file  from  his  UFT 
If  this  entry  is  the  only  one  in  the  UFT,  then  also  dele 
the  UT  entry  for  this  user. 

Release  the  space  occupied  by  the  FIT. 

Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 


ALGORITHM  L:  To  process  the  insert-record  command. 

Step  1:  Follow  the  UT  entry  of  this  user  to  his  UFT.  If  there  i 


» 


Step  2: 


Step  3: 


Step  4: 


Step  5 
Step  6 


Step  7 


Step  8: 


no  entry  for  him  in  the  UT,  then  send  a reject  signal  to 
the  front-end  computer,  and  go  to  Step  8.  Else,  look  at 
that  entry  in  the  UFT  corresponding  to  this  file.  If  the 
fourth  field  of  this  entry  is  not  1,  then  send  a reject 
signal  to  the  front-end  computer,  and  go  to  Step  8.  [Note: 
In  this  step,  we  check  to  see  if  the  user  has  opened  this 


file  for  access. 


Invoke  Algorithm  F to  determine  the  security  atom  SA  to 
which  the  record  belongs.  If  the  algorithm  returns  a neg- 
ative result,  then  send  a reject  signal  to  the  front-end 
computer,  and  go  to  Step  8. 

Look  up  the  AAPL  of  this  user-file  combination,  to  see  if 
the  insert  access  is  permitted  on  all  the  attributes  of 


records  in  atom  SA.  That  is,  check  if  the  insert  access 


is  denied  on  any  attribute  combination.  If  it  is  denied, 
then  send  a reject  signal  to  the  front-end  computer,  and 
go  to  Step  8. 

Invoke  Algorithm  G1  to  determine  the  cluster  to  which  the 
record  belongs,  and  Algorithm  G2  to  determine  the  cylinder 
in  which  the  record  is  to  be  placed.  If  either  algorithm  re- 
turns a negative  result,  then  send  a reject  signal  to  the 
front-end  computer,  and  go  to  Step  8. 

Insert  the  record  into  the  cylinder  selected  in  Step  4. 

For  each  directory  keyword  in  the  record,  do  Step  7. 


Transform  the  keyword  using  the  KXU.  Using  the  CBM  stored 


in  the  IXU,  convert  the  absolute  cylinder  number  obtained 
in  Step  4 into  a relative  cylinder  number.  Store  the  abso- 
lute cylinder  number  in  the  CAT.  Make  a new  index  term, 
where  the  entry  is  as  defined  below.  The  first  field  con- 
tains the  relative  cylinder  number,  and  the  second  field 
contains  the  security  atom  identifier  SA  obtained  in  Step  2. 
Search  SM  to  see  if  the  transformed  keyword  is  derivable 


from  any  of  the  keyword  descriptors  or  is  one  of  the  keywords 


in  the  entries  of  SM.  If  so,  then  enter  the  index  term  under 


this  keyword  or  keyword  descriptor.  If  not,  then  enter  the 
index  term  under  this  new  transformed  keyword  in  SM. 
Deallocate  space  for  the  command,  and  delete  the  command 


from  DBCCP. 
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ALGORITHM  M: 
Step  1: 


Step  2: 


Step  3: 


S tep  A : 


To  process  the  command,  retrieve-by-query. 

Follow  the  UT  entry  of  the  user  to  his  UFT.  If  there  is 
no  entry  for  him  in  the  UT,  then  send  a reject  signal  to 
the  front-end  computer,  and  go  to  Step  8.  Look  at  the 
entry  in  the  UFT  corresponding  to  this  file.  If  the  fourth 
field  of  this  entry  is  not  1,  then  send  a reject  signal  to 
the  front-end  computer,  and  go  to  Step  8.  [Note:  In  this 
step,  we  check  to  see  if  the  user  has  opened  this  file  for 
access . ] 

We  may  recall,  that  a query  consists  of  a disjunction  of 
conjunctions,  where  each  query  conjunction  consists  of  a 
conjunction  of  keyword  predicates.  Index  terms,  from  SM, 
for  a query  are  determined  by  determining  the  index  terms 
for  each  query  conjunction.  The  index  terms  for  a query 
conjunction  are  determined  by  finding  the  set  of  index 
terms  Si,  for  every  keyword  predicate  Ki  in  the  conjunction, 
and  intersecting  all  the  sets  thus  formed.  For  exact  de- 
tails regarding  how  the  index  terms  are  obtained  and  inter- 
sected, the  reader  is  referred  to  [5],  The  set  of  index 
terms  is  stored  in  a set  I. 

We  may  recall,  that  each  index  term  consists  of  a pair, 
where  the  first  part  contains  a relative  cylinder  number, 
and  the  second  part  contains  a security  atom  identifier. 

Form  the  set  SA  which  consists  of  the  second  part  of  every 
element  in  I.  That  is,  extract  all  security  atom  identifiers 
from  index  terms.  Once  again,  the  reader  is  referred  to  [5] 
for  details  of  how  this  is  done  in  the  IXU. 

If  SA  is  non-empty,  then,  for  each  member  sa  of  SA,  do  Step 


5. 


Step  5:  Search  the  AAPL  for  this  user-file  combination,  looking  for 
an  entry  whose  first  field  is  sa.  If  no  such  entry  exists, 
then  invoke  Algorithm  J to  determine  the  access  privileges 
for  this  atom  and  to  make  a new  entry  in  the  AAPL  for  this 
atom.  Search  the  second  field  of  this  entry  to  determine 
if  the  retrieve  (or  read)  access  is  denied  on  the  requested 
attribute  combination  or  any  subset  of  it  (the  user  need 
not  ask  for  entire  records  in  a ’retrieve-by-query'  request.). 
If  it  is  denied,  then  delete  the  element  i in  set  I from 
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which  sa  was  derived. 

Step  6:  For  each  member  i of  I,  do  Step  7.  If  I is  empty,  then 
send  a reject  signal  to  the  front-end  computer  and  go  to 
Step  8. 

Step  7:  Convert  the  relative  cylinder  number  il,  which  is  the  first 
part  of  i,  to  an  absolute  cylinder  number  AC,  using  the  CAT. 
Let  i2  be  the  security  atom  identifier  stored  in  the  second 
part  of  i.  Now  issue  the  request  to  the  mass  memory  to  re- 
trieve, from  cylinder  AC,  all  records  (or  parts  of  records, 
depending  on  the  attribute  combination  specified  by  the  user) 
which  satisfy  the  given  query  and  which  are  part  of  security 
atom  i2. 


Step  8:  Deallocate  space  for  the  command,  and  delete  the  command 
from  DBCCP. 
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5.  AN  ILLUSTRATION  OF  THE  SECURITY  AND  CLUSTERING  MECHANISMS 

In  this  section,  we  shall  try  to  provide  a clearer  understanding  of  the 
security  and  clustering  processes  described  formally  in  earlier  sections,  by 
means  of  an  example.  Since  the  sample  database  of  the  example  must  be  small, 
the  reader  should  not  be  surprised  that  the  security  atoms  and  clusters  formed 
are  necessarily  very  small.  However,  it  is  hoped  that  the  example  should 
serve  the  useful  purpose  of  clarifying  the  details  of  the  security  and  clustering 
mechanisms.  The  section  is  divided  into  two  sub-sections.  In  the  first  of 
these,  we  illustrate  the  construction  of  the  various  tables  and  lists  during 
database  creation  time.  In  the  second  sub-section,  we  illustrate  how  DBC  deals 
with  queries  during  database  access  time. 

We  consider,  for  illustration,  a relational  database  with  two  relations 
EMPLOYEE  (Name,  Sal,  Los,  Pos,  Ht)  and  POSITION  (Pos,  Ma,  Mq)  , where  the 
column  names,  i.e.,  attributes  of  each  relation,  are  enclosed  in  parenthesis. 
Every  employee  record,  i.e.,  a row  in  relation  EMPLOYEE,  consists  of  an 
employee  Name,  an  employee  salary  Sal,  an.  employee's  length  of  service  Los, 
a job  position  Pos  and  an  employee  hometown  Ht.  Every  position  record, 
i.e.,  a row  in  relation  POSITION,  consists  of  a job  position  Pos,  a minimum 
age  Ma  and  a minimum  qualification  Mq.  The  relational  database  is  shown 
in  Figures  27a  and  27b. 

5 . 1 Table  and  List  Construction 

In  order  to  keep  the  discussion  simple,  the  assumption  has  been  made  that 
a cylinder  can  accomodate  up  to  five  records  (instead  of  some  realistic  size, 
such  as  500,000  bytes).  The  creator  U of  a file  ALPHA  begins  by  issuing  the 
' open-database-f ile-for-creation ' request.  In  this  request,  he  specifies 
that  file  ALPHA  has  eight  attributes,  that  it  will  occupy  not  more  than  six 
cylinders  initially,  and  that  it  may  be  expanded  later  on  to  two  additional 
cylinders.  DBC  executes  Algorithm  A in  response  to  this  request. 

Figure  28a  shows  a view  of  the  file  information  table  (FIT)  after  the 
execution  of  Algorithm  A.  The  first  field  indicates  that  the  file  has  been 
created  by  the  file  creator  who  has  an  identifier  of  1.  The  second  field  in 
the  FIT  indicates  that  the  file  has  been  opened  for  creation.  From  Field  3, 
we  see  that  eight  attributes  have  been  assigned  to  the  file.  The  identifiers 
of  these  attributes  are  1 through  8.  Fields  4 and  5 of  the  FIT  indicate, 
respectively,  that  six  cylinders  have  been  allocated  to  the  file  and  that  two 
more  cylinders  may  be  allocated  to  it  if  and  when  the  need  arises.  The  cylinder 
address  table  (CAT),  the  cylinder  bit  map  (CBM),  the  security  atom  name  bit 
map  (SANBM) , the  cluster  identifier  bit  map  (CIBM)  and  the  cylinder  space 
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FILE  ALPHA 
EMPLOYEE  RELATION 


Name 

Sal 

Los 

Pos 

HT 

Rl. 

Menon 

20,000 

6 

Asst.  Prof. 

Bombay 

R2. 

Hsiao 

20,000 

12 

Prof. 

Taipei 

R3. 

Kerr 

17,000 

12 

Lecturer 

Columbus 

R4. 

Lorenzo 

13,000 

6 

Junior  Lecturer 

Mexico  City 

R5. 

Jay 

24,000 

6 

Asst.  Prof. 

Calcutta 

R6. 

Lalit 

2,000 

6 

Student 

Bombay 

R7 . 

John 

37,000 

12 

Prof. 

Baltimore 

R8. 

Jack 

16,000 

9 

Lecturer 

New  York 

R9. 

Paul 

18,000 

9 

Lecturer 

Los  Angeles 

RIO. 

Mark 

12,000 

6 

Junior  Lecturer 

Goshen 

Rll . 

Jeff 

21,000 

6 

Asst.  Prof. 

Iowa 

R12. 

Chris 

24,000 

6 

Asst.  Prof. 

Iowa 

R13. 

Pramod 

22,000 

6 

Asst.  Prof. 

Lucknow 

R14. 

Munshi 

49,000 

12 

Pror . 

Kanpur 

FILE 

CREATOR:  U 

FIGURE  27a. 

Relation 

EMPLOYEE  of  Sample  Database 

I FILE 

ALPHA 

| CREATOR:  U 

| POSITION  RELATION 

| 

Pos 

Ma 

Mq 

R15 . 

Prof. 

45 

Ph.D. 

| R16. 

Asst.  Prof. 

40 

Ph.D. 

R17. 

Lecturer 

35 

M.S. 

R18. 

Junior  Lecturer 

30 

B.S. 

R19. 

Student 

20 

High  School  Degree 

FIGURE  27b.  Relation  POSITION  of  Sample  Database 
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table  (CST)  are  also  created  at  this  time,  and  pointers  to  these  are  entered 
in  Fields  6,  7,  8,  9 and  15,  respectively.  Field  16  indicates  that  the  file 
that  is  being  created  has  the  identifier  1. 

Figure  28b  shows  the  cylinder  space  table  (CST)  where  information 
regarding  the  availability  of  space  in  the  cylinders  allocated  to  the  file  is 
stored.  We  see,  from  the  CST,  that  the  cylinders  allocated  to  the  file  are 
1 through  6.  Since,  at  this  stage,  file  ALPHA  has  not  yet  been  loaded  into 
the  mass  memory  (MM)  of  DBC , all  the  six  cylinders  are  empty.  This  is 
indicated,  in  the  CST,  by  the  entries  in  the  second  fields.  All  these  entries 
are  five,  indicating  that  space  is  available  for  five  records  (which  is  the 
maximum  capacity  of  a cylinder)  in  each  of  these  cylinders. 

The  user  then  issues  the  'load-attribute-information'  request  followed  by 
the  'load-descriptors'  request.  In  the  former  request,  the  user  specifies 
that  the  primary  clustering  attribute  of  the  file  is  Relation,  and  the  secondary 
cl  istering  attributes  of  the  file,  in  order  of  importance,  are  Sal  and  Pos. 

The  primary  clustering  and  secondary  clustering  descriptors  specified  by  the 
user  in  the  latter  request  are  shown  in  Figures  29a  and  30a,  respectively. 

The  primary  clustering  descriptor  table  (PCDT)  and  the  secondary  clustering 
descriptor  table  (SCDT)  formed  therefrom  by  execution  of  Algorithm  C,  are 
shown  in  Figures  29b  and  30b,  respectively.  We  may  see,  from  the  last  field 
of  the  PCDT,  that  the  user  expects  all  the  records  in  the  EMPLOYEE  relation 
to  fit  into  four  cylinders,  and  all  the  records  in  the  POSITION  relation  to  fit 
into  two  cylinders.  We  recall,  from  our  discussion  in  an  earlier  section, 
that  if  a descriptor  is  of  type-B,  then  the  second  field  of  the  corresponding 
table  is  made  null.  This  is  clearly  shown  in  Figures  29b  and  30b.  Fields 
12  and  13  of  the  FIT  are  now  filled  up  with  pointers  to  these  two  newly  created 
tables.  At  this  point,  DBC  executes  Algoirthm  C'  to  form  clustering  keyword 
descriptors  and  store  them  in  the  keyword  descriptor  table  (KDT) . We  recall 
that  as  many  clustering  keyword  descriptors  are  formed  as  there  are  type-A 
descriptors.  Since  only  one  of  the  four  descriptors  is  of  type-A,  only  one 
clustering  keyword  is  formed.  This  is  shown  in  Figure  31. 

Next,  in  a 'load-access-control-information'  request,  the  database 
creator  specifies  that  four  users  are  allowed  to  access  file  ALPHA.  He  also 
specifies  seven  security  conjunctions  and  the  access  privileges  of  these  four 
users  on  records  in  file  ALPHA  satisfying  these  security  conjunctions.  One  of 
the  four  users  is  the  creator  of  the  file  himself  who  is  identified  as  1.  Let 
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Descriptor  id 
SCD1 . 

SCD2. 


Descriptor 

1,000  < Sal  < 30,000 
Position  = Asst.  Prof. 


Type  of  Descriptor 
Type-A 
Type-B 


FIGURE  30a.  The  Secondary  Clustering  Descriptors  of  the  File 


Lower 

Upper 

Secondary 

Clustering 

Attribute 

Limit 

Limit 

Descriptor 

Sal 

1000 

30000 

1 

Pos 

$ 

Asst.  Prof. 

2 

FIGURE  30b.  A View  of  the  SCDT  After  the 
'Load-Descriptors'  Command 


ATTRIBUTE 

Sal 


LOWER  LIMIT 

1,000 


UPPER  LIMIT 
30,000 


FIGURE  31.  The  One  Clustering  Keyword  Descriptor  Formed  for  the  File, 
as  Stored  in  the  Keyword  Descriptor  Table  (KDT) 
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us  assume  that  the  other  three  users  have  been  allocated  the  identifiers  2,  3 
and  4,  respectively.  DBC  now  creates  four  SITs,  one  for  each  of  these  four 
users.  The  SIT  corresponding  to  User  2 is  shown  in  Figure  32.  The  tenth 
field  in  the  FIT  is  now  filled  up  to  indicate  that'Jsers  1,  2,  3 and  4 are 
allowed  to  access  file  ALPHA.  Finally,  DBC  creates  security  keyword  des- 
criptors based  on  the  seven  security  conjunctions  specified  by  the  creator  by 
executing  Algorithm  D' . These  are  shown  in  Figure  33.  We  also  recall  that 
the  user  must  specify  a default  access  descriptor  as  part  of  this  request. 

Let  us  assume  that  he  specifies  the  default  access  descriptor  to  be  one  which 
allows  all  accesses. 

The  records  are  now  loaded  one  by  one.  In  representing  a relation  in 
the  DBC  database,  a relational  tuple  is  converted  to  a set  of  DBC  keywords, 
i.e.,  attribute-value  pairs,  and  a special  keyword  is  created  for  the  relation 
name.  For  example,  the  first  tuple  in  the  EMPLOYEE  relation  is  represented 
as  a DBC  record: 

(<Relation,  EMPLOYEE> , <Name , MENON>,<Sal,  20,000>, 

<Los , 6>,<Pos,  ASST.  PROF>,<Ht,  BOMBAY > ) . 

DBC  determines,  for  each  record,  the  security  atom  and  cluster  in 
which  it  belongs,  and  the  cylinder  in  which  the  record  is  to  be  placed.  While 
the  record  is  being  loaded  into  the  mass  memory  (MM) , the  directory  in  the 
structure  memory  (SM)  is  updated  to  reflect  the  presence  of  the  newly  inserted 
record.  The  directory  keywords  of  a record  are  specified  in  the  records 
themselves.  Let  us  assume  that  the  user  specifies  only  security  and  clustering 
keywords  to  be  directory  keywords. 

The  security  atom  definition  table  (SADT)  and  the  cluster  identifier 
definition  table  (CIDT)  are  also  formed  as  the  records  are  being  loaded.  Let 
us  illustrate  the  process  of  loading  a record  by  means  of  an  example.  Consider 
that  the  first  record  Rl,  as  shown  in  Figure  27a,  is  being  loaded.  First, 
the  security  atom  to  which  the  record  belongs  is  determined  by  identifying 
those  security  conjunctions  that  are  satisfied  by  the  record.  We  shall 
consider  the  seven  security  conjunctions  SI,  ...,  S7  in  turn.  The  first 
predicate  of  SI,  (Sal_>20 , 000) , is  satisfied  by  the  keyword  <Sal,  20,000>. 

However,  there  is  no  keyword  to  satisfy  the  second  predicate  of  SI  which  is 
(Pos=PR0FESS0R) . Thus,  the  record  does  not  satisfy  SI.  However,  the  record 
satisfies  S2,  because  the  keyword  <Sal,  20,000>  satisfies  the  predicate  (Sai<2D,00D) 
and  the  keyword  -Los,  6>  satisfies  the  predicate  (Los=6).  The  record  does  not 
satisfy  S3,  since  it  does  not  contain  any  keyword  to  satisfy  the  first  predicate 
of  S3.  The  record  does  satisfy  S4,  because  the  keyword  <Sal,  20,000>  satisfies 
the  predicate  (Sal  <_  30,000)  and  the  keyword  <Los,6>  satisfies  the  predicate 
(Los=6) . The  record  does  not  satisfy  S5,  because  n keyword  to  satisfy  the 
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SECURITY  INFORMATION  TABLE  (SIT) 


Security 

Conj unction 

Set  of  Access  Descriptors 

SI. 

(Sal  < 20,000)  A 

{([Name,  Sal],  No-Modify),  ([Name,  Los],  No-Read), 

(Pos  = Professor) 

([Name,  Pos,  Ht],  No-Read)} 

S2 . 

(Sa]  < 20,000)  A 
(Los  = 6) 

{([Name,  Los],  No-Modify)} 

S3. 

(Pos  = Professor)  A 
(Los  = 12) 

{([Name,  Sal],  No-Read),  ([Name,  Los],  No-Read), 

([Name,  Pos,  Ht],  No-Read)} 

S4. 

(Sal  < 30,000)  A 
(Los  = 6) 

{([Name,  Sal],  No-Read),  ([Name,  Los],  No-Read), 

([Name,  Pos,  Ht],  No-Read)} 

S5 . 

(Sa]  < 30,000)  A 

{([Name,  Los],  No-Read),  ([Name,  Pos,  Ht ] , No-Modify)]} 

(Los  = 9) 

S6. 

(Sal  < 1,000)  A 
(Los  = 12) 

{([Name,  Sal],  No-Read),  ([Name,  Los],  No-Read), 

([Name,  Pos,  Ht],  No-Read)} 

S7 . 

(Pos  = Professqr) 

{([Name,  Sal],  No-Read),  ([Name,  Los],  No-flodify), 

([Name,  Pos,  Ht],  No-Modify),  ([Pos,  Ma,  Mq], 
No-Modify) } 

FIGURE  32.  The  SIT  Corresponding  to  User  A (Identifier  2) 

on  File  ALpha  (File  id  1).  One  of  the  Four  SITs 
Formed  After  the  ' Load -Access-Control-Inf ormation ' 
Request . 
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FIGURE  33.  The  Security  Keyword  Descriptors  Formed  for 
the  File 
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predicate  (Los=9)  exists  in  the  record.  Similarly,  since  no  keyword  to  satisfy 
the  predicate  (Los-12)  exists  in  the  record,  S6  is  not  satisfied  by  the  record. 
Finally,  the  record  does  not  satisfy  S7  because  no  keyword  to  satisfy  (Pos= 
PROFESSOR)  exists  in  the  record.  Thus,  the  record  satisfies  security  conjunctions 
S2  and  S4,  and  is  part  of  Atom  1 which  is  defined  by  these  two  conjunctions. 

This  is  reflected  in  Figure  34. 

To  determine  the  cluster  to  which  the  record  belongs,  DBC  proceeds  as 
follows.  It  determines  the  primary  clustering  keyword  to  be  <Relation, 

EMPLOYEE>.  This  is  obviously  derivable  from  the  primary  clustering 
descriptor  PCD1  which  is  (Relat ion=EMPLOYEE) . DBC  then  determines  that  the 
record  has  two  secondary  clustering  keywords  <Sal,  20,000>  and  <Pos,  ASST. 

PROF>.  However,  the  former  is  more  important  and  is  the  one  that  shall  be 
considered  for  clustering  purposes.  We  see  that  <Sal,  20,000>  is  derivable 
from  SCD1  which  is  (l,00CKSal£30,000) . Thus,  the  cluster  the  record  belongs 
to  is  defined  by  PCD1  and  SCD1  and  this  is  Cluster  1. 

Finally,  DBC  must  determine  the  cylinder  into  which  the  record  must  be 
inserted.  Since  this  is  the  first  record  being  inserted,  DBC  arbitrarily 
chooses  to  insert  the  record  into  Cylinder  1. 

At  the  end  of  the  loading  process,  the  state  of  the  security  atom 
definition  table  (SADT) , the  cluster  identifier  definition  table  (CIDT)  and 
the  directory  in  SM  are  shown  in  Figures  34,  35,  and  36  respectively.  Each 
directory  entry  consists  of  a keyword  or  keyword  descriptor,  followed  by 
index  terms.  Each  index  term  is  a pair  of  the  form  (cylinder  number,  atom 
number) . 

Figures  37  and  38  clearly  show  the  formation  of  the  security  atoms  and  the 
clusters  in  terms  of  records.  Figure  39  indicates  which  records  have  gone  into 
which  cylinders.  The  final  form  of  the  CST  is  shown  in  Figure  40.  We  see 
from  the  CST  that  Cylinders  1 and  3 are  incapable  of  holding  any  more  records 
as  they  are  filled  to  capacity.  The  final  form  taken  by  the  FIT  is  shown  in 
Figure  41.  The  only  field  that  needs  explanation  is  Field  10  which  indicates 
that  four  users  with  identifiers  1,  2,  3 and  4 are  allowed  to  access  the  file. 

5. 2 Execution  of  Requests 

Before  a user  may  issue  any  access  command,  he  must  issue  an  'open- 
database-f ile-for-access ' request.  Consider  that  User  2 issues  the  above 
request.  At  this  stage,  the  atomic  access  privilege  list  (AAPL)  is  created. 

A view  of  User  2's  AAPL  is  shown  in  Figure  42. 

We  now  consider  the  execution  of  three  typical  access  requests  by  the 
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user. 


Keyword  or 
Keyword  Descriptor 


Index  Terms 


< Sal  < 1000 

— 

1000  < Sal  < 20000 

(1,1), (3,1) 

20000  < Sal  < 30000 

(1,1), (3,1), (4,1) 

30000  < Sal  < ® 

(2,2) 

< Los  < 6 

6 < Los  < 9 

(1,1), (3,1), (4,1) 

9 < Los  < 12 

(3,1) 

12  < Los 

(2,2), (1,1) 

<Relation,  EMPL0YEE> 

(1,1), (3,1), (4,1), (2, 2) 

<Relation,  POSITION^ 

(6, 4), (5, 3) 

<Pos,  PR0FESS0R> 

(2, 2), (5, 3) 

<Pos,  ASST.  PR0F#> 

(1,1), (3,1), (4,1) 



FIGURE  36.  A View  of  the  Directory  in  the  Structure 
Memory  After  all  the  Records  Have  Been 
Loaded 
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FIGURE  37.  A View  Showing  How  Records  Fit 
Into  the  Seven  Atoms 
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FIGURE  38.  A view  showing  how  records  are  grouped 
into  clusters. 
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FIGURE  39.  A view  showing  how  records  are  placed  in 
cylinders 


Example  1:  Read  the  length  of  service  of  all  employees  with  salaries  greater 
than  $30,000. 

This  is  an  example  where  the  physical  clustering  of  the  records 
becomes  very  useful  for  efficient  execution  of  the  request.  The 
request  requires  retrieval  of  all  records  that  satisfy  the  query 

( (Relation=EMPLOYEE)  & (Salj>30,000))  . 

The  index  terms  for  the  predicate  (Relation=EMPLOYEE)  are  obtained  from 
the  directory  keyword  <Relation,  EMPL0YEE>  (see  Figure  36).  The  index 
terms  for  (Sal>j30,000)  are  obtained  from  the  keyword  iescriptor 
(30, 000<Sal<°°) . The  two  sets  of  index  terms  are  now  intersected 
by  the  structure  memory  information  processor  (SMIP)  to  produce  the 
following  index  term: 

(2,2) 

We  note  that  although  there  are  many  records  satisfying  the  query, 
only  one  cylinder  contains  all  of  them.  This  is  because  of  clustering 
by  the  Relation  and  Sal  attributes.  Looking  at  the  AAPL  for  User  2, 
we  notice  that  the  'read'  access  is  allowed  on  the  Los  attribute  of 
records  in  Atom  2.  The  request  is  now  issued  to  the  mass  memory  (MM) 
to  retrieve  the  Los  attribute  from  records  in  Cylinder  1,  which  satisfy 
the  query 

(Relation= EMPLOYEE)  & (SaL>30, 000) 
and  which  are  in  Atom  2. 

Example  2:  Read  the  records  of  all  employees  who  have  worked  six  years. 

This  is  an  example  of  a request  where  the  response  set  is  not 
clustered.  The  request  requires  retrieval  of  all  records  that  satisfy 
the  query 

( (Relation=EMPL0YEE)  & (Los=6)). 

The  intersected  set  of  index  terms  for  the  query  is 

(1,1),  (3,1),  (4,1). 

Looking  at  the  AAPL  of  the  user,  we  note  that  entire  records  may  not 
be  read  from  Atom  1.  Thus,  the  request  is  rejected  by  DBC. 

Example  3:  Insert  the  record 

(<Relation,  POSITION,  <Pos,  SECRETARY > , <Ma,  20> , <Mq , HIGH  SCH00L>) 
into  file  ALPHA. 

DBC  first  determines  that  the  record  belongs  to  security  atom  0 
since,  it  does  not  satisfy  any  of  the  security  conjunctions.  From  the 
AAPL,  we  see  that  the  accesses  disallowed  on  all  attribute  combinations 
of  records  in  Atom  0 are  defined  in  the  default  access  descriptor. 

Since  the  default  access  descriptor  allows  all  accesses,  the  insert 
access  is  permitted.  The  primary  clustering  keyword  of  the  record  is 
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AAPL  for  User  A (User  Id2) 


Atom  ID 


0 

1 


Set  of  Field-Level 
Security  Descriptors 


Default  access  descriptor 

{([Name,  Sal],  No-Read),  ([Name,  Los], 
No-Read),  ([Name,  Pos,  Ht],  No-Read)} 

{([Name,  Sal],  No -Read ) , ([Name,  Los], 
No-Read),  ([Name,  Pos,  Ht],  No-Read), 
([Pos,  Ma,  Mq],  No-Modify)} 

{([Name,  Sal],  No-Read),  ([Name,  Los], 
No-Read),  ([Name,  Pos,  Ht],  No-Read)} 


{([Name,  Sal],  No-Read),  ([Name,  Los], 
No-Read),  ([Name,  Pos,  Ht],  No-Read)} 

{([Name,  Los],  No-Read),  ([Name,  Pos, 
Ht],  No-Modify)} 


{([Name,  Sal],  No-Read),  ([Name,  Los], 
No-Modify),  ([Name,  Pos,  Ht],  No-Modify), 
([Pos,  Ma,  Mq],  No  Modify)^ 


FIGURE  42.  Atomic  Access  Privilege  List  (AAPL)  for 
User  A on  File  Alpha 
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<Relation,  POSITION>  and  this  is  derivable  from  PCD2.  Its  only 
secondary  clustering  keyword  <Pos,  SECRETARY > is  not  derivable  from  any 
secondary  clustering  descriptor  and  is  hence  assumed  to  be  derived  from 
a null  secondary  clustering  descriptor.  Now,  it  is  determined  from  the 
CIDT  (figure  35),  that  the  record  belongs  to  Cluster  3,  and  that  all 
records  in  Cluster  3 are  placed  in  Cylinder  5.  The  CST  (Figure  AO) 
indicates  that  Cylinder  5 can  accommodate  one  more  record.  Hence,  the 
request  is  issued  to  MM  to  insert  the  given  record  into  Cylinder  5. 


I 
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6.  A CLOSER  LOOK  AT  THE  ARCHITECTURAL  ELEMENTS 

In  Chis  section,  we  discuss  the  architectural  elements  of  DBC  for 
access  control  and  data  clustering. 

Whenever  an  update  command  is  handled  in  DBC,  we  know  that  the  update 
entails  at  least  a retrieval  of  data  from  the  mass  memory  (MM) , followed  by 
an  insertion  of  updated  data  back  into  MM.  We  also  know  that  every  time  a 
record  is  inserted  into  MM,  DBC  must  determine  the  following  three  pieces  of 
information  about  it. 

(1)  The  security  atom  it  belongs  to. 

(2)  The  cluster  that  it  belongs  to. 

(3)  The  cylinder  in  which  it  must  be  placed. 

Obviously, speedy  determination  of  these  three  types  of  information  will 
improve  the  perf,. -mance  of  DBC  considerably.  Such  determination  is  carried 
out  by  Algorithms  F,  G1  and  G2  (See  Section  4).  These  algorithms  mostly 
involve  the  searching  of  various  tables.  Naturally,  these  searches  may  be 
speeded  up  by  the  use  of  parallel  techniques  rather  than  seuqential  ones,  and 
that  is  what  we  propose  to  examine. 

Looking  at  Algorithm  F,  we  see  that  the  security  information  table  (SIT) 
needs  to  be  searched  first.  However,  the  algorithm  demands  that  DBC  step 
through  every  entry  in  the  SIT.  Thus,  the  table  does  not  lend  itself 
to  parallel  search  hardware.  The  next  table  searched  by  tip's  algorithm  is 
the  security  atom  definition  table  (SADT) . Here,  the  algorithm  looks  for 
a particular  set  of  security  conjunction  identifiers  in  the  second  field 
of  an  SADT  entry.  This  is  just  the  kind  of  search  that  will  be  speeded  up 
by  the  use  of  parallel  search  hardware. 

Algorithm  G1  requires  DBC  to  search  the  primary  clustering  definition  table 
(PCDT)  and  the  secondary  clustering  definition  table  (SCDT) , for  entries  that 
have  particular  values  in  the  attribute  fields  of  these  two  tables.  That  is, 
given  an  attribute  identifier,  it  is  necessary  to  determine  all  the  entries  in 
PCDT  and  SCDT  that  have  this  identifier  as  their  first  field.  This  search 
will  obviously  be  speeded  up  if  done  in  parallel.  The  algorithm  also  needs  to 
search  the  cluster  identifier  definition  table  (CIDT) , to  look  for  an  entry  with 
a particular  pair  of  values  in  its  second  and  third  fields.  That  is,  the 
primary  clustering  descriptor  identifier  PID,  and  the  secondary  clustering 
descriptor  identifier  SID  (from  which  the  primary  clustering  keyword  of 
a record  and  the  most  important  secondary  clustering  keyword  of  the  same  record 
are  derivable)  are  first  determined.  It  is  now  necessary  to  search  CIDT  for  an 
entry  with  PID  in  Field  2 and  SID  in  Field  3.  This  is  also  the  kind  of  search 
that  may  be  more  speedily  done  in  parallel.  Similarly,  Algorithm  G2  should 
also  search  CIDT  in  a parallel  fashion. 
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The  only  table  searched  by  Algorithm  M (which  is  executed  in  response  to 
the  ' retrieve-by-query ' request)  is  the  atomic  access  privilege  list  (AAPL) . 
Given  a security  atom  identifier  sa,  it  is  necessary  to  search  AAPL  to 
determine  the  entry  which  contains  sa  in  its  first  field.  At  first  glance, 
it  seems  profitable  to  search  elements  of  AAPL  in  parallel.  However,  we 
notice  that  the  first  entry  in  the  AAPL  is  for  Atom  0,  the  second  entry 
is  for  Atom  1,  and  so  on.  Thus,  given  Atom  sa,  it  is  only  necessary  to  look 
at  the  (sa+l)-th  entry  of  AAPL  and  this  can  be  quite  easily  done  by  a single 
processor  with  a random  access  memory. 

Thus,  we  conclude  that  SALT,  CIDT,  PCDT  and  SCDT  should  utilize  parallel 
search  hardware.  The  remaining  tables,  with  the  exception  of  those  in  the 
index  translation  unit  (IXU),  are  palced  in  DBCCP.  The  parallel  search 
hardware  is  called  the  security  and  clustering  unit  (SCU)  which  has  the 
capability  of  searching  tables  in  parallel. 

Let  us  review  the  role  played  by  DBCCP.  DBCCP  regulates  the  operation 
of  both  the  structure  and  data  loops  and  interfaces  with  the  front-end 
computer  systems.  It  processes  all  DBC  commands  received  from  the  front-end 
computer  systems,  schedules  their  execution  based  on  the  command  type  and 
priority,  and  routes  the  response  set  to  the  front-end  computer  systems. 
Additionally,  it  makes  use  of  the  parallel  searching  capabilities  of  SCU  to 
search  the  tables  needed  for  the  enforcement  of  security  and  the  clustering 
of  records. 

Architecturally  and  technologically,  SCU  is  organized  in  a manner  similar 
to  the  structure  memory  (SM) . SCU  also  consists  of  a number  of  processing 
units.  Each  unit  is  associated  with  its  own  blocks  of  memory.  Tables  to  be 
searched  in  SCU  are  placed  in  such  a manner  that  they  are  spread  evenly  over 
the  memory  blocks  of  all  the  processing  units.  Thus,  in  searching  a parti- 
cular table,  each  processing  unit  needs  to  search  only  that  part  of  the  table 
that  is  in  its  own  memory  block.  Since  all  processing  units  search  simul- 
taneously, the  table  is  in  effect  being  searched  in  parallel. 

On  the  other  hand,  these  tables  could  have  been  included  in  the  structure 
memory  (SM) , since  it  also  has  parallel  search  capabilities,  but  this  is 
not  done  for  two  important  reasons: 

(1)  It  is  prefereable,  for  the  sake  of  improved  reliability  and  security, 
to  preserve  the  functional  specialization  of  each  component  of  DBC. 

SM  is  really  an  index  to,  and  a condensed  image  of,  the  mass  memory 
(MM) . If  it  were  to  be  used  for  security  enforcement  or  data 
clustering,  this  would  no  longer  be  true. 


(2)  It  is  advantageous  to  assign  SM  only  a limited  set  of  functions 

in  order  that  it  does  not  become  a bottleneck  and  impair  the  pipe- 
lining achievable  in  DBC.  To  elaborate  this  point,  let  us  consider 
the  processing  necessary  in  preparation  for  the  simultaneous  exe- 
cution of  a 'retrieve-by-query'  request  and  an  'insert-record'  request. 
To  satisfy  the  former  request,  it  is  necessary  to  search  SM  and 
retrieve  index  terms.  For  inserting  a new  record,  on  the  other  hand, 
it  is  necessary  to  determine  the  security  atom  and  cluster  the 
record  belongs  to  by  searching  tables  stored  in  SCU.  Since,  index 
terms  are  stored  in  SM  separately  from  the  security  and  clustering 
related  tables  which  are  in  SCU,  both  these  searches  may  be  conducted 
concurrently.  In  contrast,  if  the  tables  for  determining  security 
atom  numbers  were  stored  in  SM,  the  searches  would  have  to  be 
conducted  serially,  one  after  another. 

Figure  43  is  a schematic  of  the  refined  architecture  of  DBC.  We  note  that 
the  security  filter  processor  (SFP)  is  composed  of  two  components,  SCU  and  the 
post  processor  (PP). 

To  understand  the  role  played  by  PP,  let  us  briefly  recapitulate  the 
sequence  of  events  that  occur  after  a user  issues  a 'retrieve-by-query'  request. 
First,  index  terms  corresponding  to  the  query  conjunctions  of  the  query  are 
retrieved  from  the  structure  memory  (SM)  and  intersected  in  the  structure 
memory  information  processor  (SMIP) . The  index  terms  contain,  among  other 
things,  the  security  atoms  which  could  possibly  contain  records  satisfying  the 
user's  query.  The  atomic  access  privilege  list  (AAPL)  of  this  user  is  now 
accessed  in  order  to  check  the  allowed  accesses  on  records  in  the  security 
atoms  obtained  from  the  index  terms.  Let  us  assume  that  the  user  does  not 
wish  to  read  entire  records.  Rather,  he  wishes  to  read  only  two  attributes, 

Name  and  Salary,  from  records  satisfying  his  query.  The  AAPL  will  clearly 
indicate,  for  the  records  in  the  relevant  security  atoms,  whether  'read' 
access  on  the  above  combination  of  attributes  is  allowed  or  not.  The  list  of 
security  atoms  on  whose  records  such  access  is  allowed  is  now  sent  on  to  the 
mass  memory  (MM)  along  with  the  cylinder  numbers  obtained  from  the  index  terms, 
the  user's  query  and  the  attribute  combination  [Name,  Salary]  required  to  be 
read . 

The  mass  memory  (MM)  now  retrieves  the  relevant  records,  which  satisfy 
the  security  requirements  and  the  user's  query,  in  their  entirety.  These 
entire  records  are  now  passed  on  to  the  post  processor  (PP)  along  with  the 
attribute  combination  [Name,  Salary]  required  to  be  read  for  the  user.  PP 
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extracts  the  relevant  fields  from  the  retrieved  records  and  passes  these  extracted 
fields  back  to  DBCCP  which  then  passes  them  to  the  front-end  computer.  Before 
passing  the  extracted  fields  to  DBCCP,  PP  may  sort  the  retrieved  information  on 
some  attribute.  Thus,  PP  performs  the  functions  of  field  extraction  and  sorting. 
PP  may  also  be  used  to  perform  the  (relational)  join  operation. 

6.1  Certain  Restrictions  Placed  on  the  User 

At  this  point,  we  wish  to  point  out  certain  conditions  placed  on  a user  of 
DBC.  Suppose  that  a user  wishes  to  access  records  which  have  the  attributes: 

Name,  Salary  and  Hometown  present  in  their  keywords.  Also,  let  us  assume  that 
the  user  wishes  to  read  only  the  first  two  of  these  three  fields  from  such 
records.  We  require  that  the  'retrieve-by-query'  request  received  by  DBC 
contains  a list  of  fields  to  be  read,  rather  than  a list  of  fields  that  need 
not  be  read.  Thus,  in  the  above  example,  the  request  specifies  'read  Name  and 
Salary',  rather  than  'read  all  fields  except  Hometown'.  The  reason  for  this 
should  be  immediately  apparent  to  the  reader.  If  the  request  is  of  the  former 
kind,  the  AAPL  may  be  checked  a priori  to  see  if  read  access  on  the  combination 
[Name,  Salary]  is  disallowed.  However,  if  the  request  is  of  the  latter  kind, 
this  a priori  checking  cannot  be  done.  In  this  case,  records  satisfying  the 
user's  query  must  first  be  retrieved  from  MM.  Then,  the  fields  in  the  records 
must  be  examined.  The  AAPL  must  now  be  checked  to  see  if  any  field  or 
combination  of  fields  in  the  records  retrieved  may  not  be  read.  Thus,  costly  a 
posteriori  checking  needs  to  be  done  and  this  must  be  avoided  as  it  leads  to 
access  imprecision.  Hence,  we  require  the  user  to  specify  a list  of  fields  to 
be  read  rather  than  a list  of  fields  that  need  not  be  read.  The  only  time  that 
the  user  need  not  specify  a list  of  fields  to  be  read  is  when  he  wishes  to 
read  entire  records.  In  this  case,  it  is  only  necessary  to  check,  from  the  AAPL, 
if  'read'  access  is  disallowed  on  any  attribute  combination.  This  may  be  done 
a priori. 

DBC  also  does  not  attempt  to  output  records  which  may  in  some  sense 
partially  satisfy  a query.  Let  us  explain  this  by  means  of  an  example.  Once 
again,  consider  that  records  consist  of  three  attributes:  Name,  Salary  and 
Hometown.  Also,  let  us  assume  that  'read'  access  is  denied  for  a particular 
user  on  this  combination  of  these  thrr e fields.  If  this  user  requests  to  read 
entire  records,  this  request  certainly  cannot  be  allowed.  However,  he  may 
read  either  of  the  six  combinations  listed  below. 
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1.  [Name,  Salary]. 

2.  [Name,  Hometown], 

3.  [Salary,  Hometown]. 

4 . [ Name ] . 

5.  [Salary]. 

6.  [Hometown]. 

Thus,  when  the  user  requests  entire  records,  DBC  has  the  choice  of  either 
denying  the  request,  or  sending  to  him  one  of  the  six  field  combinations  listed 
above.  Rather  than  make  such  an  arbitrary  choice,  DBC  will  reject  the  request. 


6 . 2 Adding  and  Changing  Security  Requirements 

It  is  possible  that,  from  time  to  time,  the  file  creator  would  like  to 
make  certain  changes  to  his  file.  Perhaps,  the  most  common  modification 
requested  by  a file  creator  is  the  addition  of  one  more  user  to  the  list  of 
users  allowed  to  access  his  file.  The  file  creator  also  needs  to  specify,  for 
this  new  user,  a list  of  field-level  security  descriptors,  one  such  list  for  each 
security  conjunction  of  the  file. 

Pictorially,  the  situations  before  and  after  the  addition  of  a new  user 
(user  3)  are  shown  in  Figures  44  and  45.  In  Figure  45,  P31,  ...,  P3n  are  the 
lists  of  field-level  security  descriptors  specified  for  the  new  user.  DBC  easily 
responds  to  this  request  by  creating  a new  AAPL  for  this  user  and  storing  it  in 
DBCCP . 

A file  creator  may  also  wish  to  modify  the  security  conjunctions  of  his 
file.  For  example,  he  may  wish  to  add  another  security  conjunction  SC.  This, 
as  we  know,  could  potentially  double  the  number  of  existing  security  atoms. 

Each  existing  security  atom  could  be  split  into  two  parts.  One  part  would 
contain  records  that  satisfy  SC  and  the  other  part  would  contain  records  that  do 
hot  satisfy  SC.  Such  an  addition  on  the  part  of  the  file  creator  would 
require  a major  operation  on  the  part  of  DBC.  Each  record  in  the  file  must  be 
read  from  MM,  examined  to  see  whether  it  satisfies  the  new  security  conjunction, 
assigned  to  a new  security  atom  and  reinserted  into  MM.  All  the  old  directory 
entries  have  to  be  deleted  and  new  ones  created  during  the  reinsertion  process, 
since  the  assignment  of  records  to  security  atoms  has  now  been  changed.  Along 
with  the  new  set  of  security  conjunctions,  the  file  creator  also  needs  to 
specify  the  protection  requirements  for  each  user  of  his  file  on  records 
satisfying  these  security  conjunctions.  Thus,  DBC  needs  to  delete  the  old 
security  information  tables  and  create  new  ones.  DBC  also  needs  to  create  new 
AAPLs  for  the  various  users  of  the  file. 
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DBC  handles  the  above  operations  in  the  following  manner.  It  first  copies 
the  file  in  an  auxiliary  store,  then  pretends  to  have  received  a 'delete-file' 
request,  followed  by  the  'open-database-f ile-for-creation' , ' load-attributes- 
information' , 'load-descriptors',  'load-access-control-information'  and  'close- 
database-file'  requests,  and  executes  the  necessary  algorithms.  That  is, 

DBC  looks  upon  the  process  as  if  the  user  has  deleted  the  file  and  created  a 
new  one.  It  is,  of  course,  hoped  that  the  file  creator  changes  the  security 
predicates  of  the  file  infrequently.  Since  security  predicates  partition 
the  file  into  atoms,  new  or  changed  predicates  require  re-partition  of  the 
entire  file  which  is  both  time-consuming  and  costly. 

6.3  Some  Miscellaneous  Information  in  the  Security  and  Clustering  Unit  (SCU) 

SCU  is  used  by  DBCCP  whenever  records  have  to  be  loaded  into  MM  of  DBC. 
DBCCP  does  not  need  to  use  SCU  in  order  to  handle  'retrieve-by-query'  requests. 

We  recall  that  SCU  stores  four  different  kinds  of  tables.  In  order  to 
differentiate  among  the  different  kinds  of  tables,  an  additional  field  is 
created  in  each  entry  of  the  various  tables.  This  field  is  the  table  identifier 
field,  and  is  made  to  be  the  first  field  of  every  entry.  If  the  entry  is  an 
entry  in  the  SADT,  then  this  field  is  made  1.  If  the  entry  is  in  the  PCDT,  this 
field  is  made  2.  If  the  entry  is  in  the  SCDT,  this  field  is  made  3,  and  if  the 
entry  is  in  the  CIDT,  this  field  is  made  4.  Thus,  the  processors  of  SCU  can 
identify  the  type  of  table  by  reading  this  first  field. 

Also,  tables  belonging  to  all  the  files  known  to  DBC  are  stored  in  SCP. 

To  show  the  correspondence  between  the  tables  and  files,  another  field  is 
incorporated  with  each  entry  of  every  table.  This  field  (which  is  made  to  be 
the  second  field  in  all  entries  of  all  tables)  stores  the  file  identifier  of 
the  file  for  which  the  table  has  been  created.  Thus,  for  example,  an  entry 
in  the  SADT  will  now  have  four  fields,  instead  of  two  as  indicated  in  earlier 
sections. 
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7.  CONCLUDING  REMARKS 

In  the  preceding  sections,  we  have  endeavored  to  present  a complete 
picture  of  the  mechanisms  of  security  enforcement  and  data  clustering  in  DBC. 
This  is  because  we  wish  to  tie  up  a lot  of  loose  ends  from  previous  technical 
reports  and  also  because  we  present  certain  refinements  in  the  architecture. 

With  respect  to  security,  the  most  important  concept  is  that  of  security 
atoms  due  to  McCauley  [81.  However,  we  have  expanded  the  concept  in  order  to 
allow  for  protection  of  records  at  the  field  (or  attribute)  level.  All  security 
checks  are  made  prior  to  the  retrieval,  of  records,  so  that  no  access  imprecision 
results. 

DBC  design  exploits  both  existing  and  emerging  technologies.  The  on-line 
mass  memory  (MM)  is  made  from  moving-head  disks,  perhaps  the  least  expensive 
of  all  large-capacity  on-line  storage  devices.  The  disks,  however,  are  modified 
to  allow  parallel  read-out  of  an  entire  cylinder  in  one  revolution  time, 
instead  of  one  track  at  a time.  It  seems  adequate  that  access  is  limited  to 
one  or  a few  cylinders,  since  single  user  transactions  seldom  refer  to  data 
beyond  megabytes  in  size.  As  long  as  data  is  not  physically  scattered, 
sweeping  of  a large  number  of  disk  cylinders  can  be  avoided.  The  physical 
dispersion  of  related  data  is  prevented  by  a built-in  clustering  mechanism  which 
uses  information  provided  by  the  creator  of  the  database. 

The  algorithms  for  supporting  these  have  been  discussed  in  great  depth. 

The  post  processor  (PP)  will  be  discussed  in  a forthcoming  report.  The  PP 
allows  groups  of  retrieved  records  to  be  sorted  on  the  basis  of  some  attribute, 
or  joined  with  other  groups  of  records  (a  join  being  assumed  to  mean  the  same 
as  a relational  equality  join).  Furthermore,  it  performs  field  extraction  for 
field-level  protection. 
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